How to treat <s> and </s> when calculating a unigram LM?

I am a beginner in NLP and I'm confused about how to treat the <s> and </s> symbols when calculating counts for a unigram model. Should I count them or just ignore them?

Solution

If I understand correctly, <s> and </s> are special (fake) unigrams placed before the first and after the last real unigram of each text. In that case there is no need for them in a unigram model: every string contains them, so they provide no additional information.

Such special unigrams can be useful for higher-order n-grams. For example, they make it possible to extract from a 1-word string like hello 2 bigrams: <s> hello and hello </s>, or 3 trigrams: <s0> <s1> hello, <s1> hello </s1>, and hello </s1> </s0>.
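
The padding described above can be sketched in Python. This is only an illustration, not a standard API: the function names `boundary_symbols` and `padded_ngrams` are made up for this example, and the numbered symbols (<s0>, <s1>, ...) simply follow the naming used in the trigram example. For n = 1 no padding is added, so the unigram counts contain only real words.

```python
from collections import Counter

def boundary_symbols(n):
    """Return the n-1 start and n-1 end symbols used to pad for n-grams."""
    k = n - 1
    if k == 1:
        # Bigrams need a single symbol on each side: plain <s> and </s>.
        return ["<s>"], ["</s>"]
    # For n = 1 both lists are empty; for n >= 3 the symbols are numbered.
    left = [f"<s{i}>" for i in range(k)]
    right = [f"</s{i}>" for i in reversed(range(k))]
    return left, right

def padded_ngrams(tokens, n):
    """Pad the token list with boundary symbols and return its n-grams."""
    left, right = boundary_symbols(n)
    padded = left + list(tokens) + right
    return [tuple(padded[i:i + n]) for i in range(len(padded) - n + 1)]

tokens = ["hello"]
unigram_counts = Counter(padded_ngrams(tokens, 1))  # only real words
bigrams = padded_ngrams(tokens, 2)
trigrams = padded_ngrams(tokens, 3)
print(unigram_counts)
print(bigrams)
print(trigrams)
```

With the 1-word string hello this yields one unigram, two bigrams, and three trigrams, matching the counts given above.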
