The Chan (JFE 2003) news data contains two parts.

1) A zipped file called "newssmall.zip", which contains a SAS dataset "small.sas7bdat" with the following columns:

   pubname - the news source
   dt - the date (DDMMMYYYY format)
   words - the number of words in the article (as reported by Dow Jones Interactive)
   permno - the CRSP stock permno

   Note that the news data covers only a random subset of CRSP stocks (around 1/5 to 1/4 of the CRSP universe at any time) from 1980-2000, and only selected news sources (mainly Dow Jones).

2) The file "crsp16a.csv", which lists every CRSP permno that was searched for news, whether or not any stories were found. A stock that is in the CSV file but absent from the news dataset on a given date had no news that day. Some stocks never have any news at all: they appear in the CSV file but not in the news dataset.
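
Because a missing row in the news dataset means "searched, but no news", the two parts usually have to be combined before counting stories. The following is a minimal sketch of that step in plain Python, not part of the original data release. It assumes the news rows have already been loaded as (permno, dt) pairs with dt as a DDMMMYYYY string (for example via pandas.read_sas, which may instead return dt as a date value), and that the permnos have been read from the CSV file, whose column layout is not documented here. The function names and the sample permnos are hypothetical.

```python
from collections import Counter
from datetime import date, datetime


def parse_dt(text):
    """Parse a DDMMMYYYY string such as '05JAN1990' into a date."""
    return datetime.strptime(text, "%d%b%Y").date()


def daily_story_counts(news_rows, searched_permnos, dates):
    """Count stories per (permno, date), with explicit zeros.

    news_rows: list of (permno, dt_string) pairs from the news dataset.
    searched_permnos: permnos listed in the CSV file of searched stocks.
    dates: the dates the output should cover.
    """
    searched = set(searched_permnos)
    counts = Counter(
        (permno, parse_dt(dt)) for permno, dt in news_rows if permno in searched
    )
    # A Counter returns 0 for missing keys, which encodes "searched, no news".
    return {(p, d): counts[(p, d)] for p in sorted(searched) for d in dates}


def never_covered(news_rows, searched_permnos):
    """Permnos that were searched but have no stories at all."""
    with_news = {permno for permno, _ in news_rows}
    return sorted(set(searched_permnos) - with_news)


# Made-up sample data for illustration only (not real CRSP permnos or stories).
rows = [(10001, "05JAN1990"), (10001, "05JAN1990"), (10002, "08JAN1990")]
searched = [10001, 10002, 10003]
dates = [date(1990, 1, 5), date(1990, 1, 8)]

panel = daily_story_counts(rows, searched, dates)
no_news_ever = never_covered(rows, searched)
```

Here panel holds one entry per searched stock and date, so stock-days without news carry a count of zero instead of being absent, and no_news_ever lists the stocks that appear in the CSV file but never in the news dataset.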