Apr 19, 2010

Using market data with care

There are many ways to get market data on the internet: you can download free end-of-day
(EOD) data and analyze it with free software, pay for premium charting software that uses
a data stream, or use your trading system's charting service. Whichever you choose, there
are some common things you have to know when using market data:

4.1 The observed security

You have to know exactly what you are analyzing. Never confuse similarly named securities,
such as different series of shares or bonds of the same company. Some securities are also
traded on several exchanges, where prices are not exactly the same, so you have to be sure
that the data belongs to the security you are going to trade. To avoid confusion, we
suggest using the ISIN code where possible:

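ISIN: International Securities Identification Number, a 12-character code that identifies
a security uniquely. The ISIN identifies the security itself, not the trading venue, so
the same code normally applies on every exchange where the security is traded; note the
exchange as well when prices differ between venues. For more information see the ISIN
Wikipedia page. You can get a stock's ISIN code, for example, from the exchange where it
is traded.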
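
A quick way to catch a mistyped ISIN is to verify its check digit. The following is a
minimal sketch, assuming the standard ISIN layout (two-letter country code, nine
alphanumeric characters, one check digit verified with the Luhn algorithm). The function
name is_valid_isin is our own choice for illustration. A passing check only means the code
is well formed, not that it belongs to the security you meant.

```python
import re

def is_valid_isin(isin: str) -> bool:
    """Return True if `isin` is well formed and its check digit is correct."""
    isin = isin.strip().upper()
    if not re.fullmatch(r"[A-Z]{2}[A-Z0-9]{9}[0-9]", isin):
        return False
    # Letters become two-digit numbers (A=10 ... Z=35); digits stay as they are.
    digits = "".join(str(int(ch, 36)) for ch in isin)
    total = 0
    # Luhn: starting from the right, double every second digit.
    for position, ch in enumerate(reversed(digits)):
        value = int(ch)
        if position % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0
```

With this sketch, is_valid_isin("US0378331005") is accepted, while the same code with the
last digit changed to 6 is rejected.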

4.2 Data update

How up to date the data is depends on several factors:

• Time of market closure: in many cases the market closes in the afternoon (local time),
but if trading is continuous (e.g. for commodities), EOD data is summarized at midnight.

• Delay of data: this is an important quality factor of market data. Intraday data
usually comes with a known delay (real-time, 15 minutes, etc.). The delay of EOD data may
be less than 1 hour or more than a couple of days. At Chartoasis.com we always list the
EOD data providers with the smallest known delay.

• Update of EOD data: some providers publish market data before the market closes; in
this case the closing price is the current price of the equity on the market. It may
happen that a closing price is available before market closure but volume data is not.
If the market data that you want to analyze is available from several providers, choose
the one that suits you best. Visit
http://www.chartoasis.com/free-analysis-software/free-data.html for alternative
providers of free market data (the list is constantly improving).

4.3 Errors in market data

We think it is very important to inform users about inconsistencies found within the data.

• As we said in chapter 3, Errors in market data, errors do happen. Whenever we ran an
automated consistency check on a large enough sample of a provider's data, we found some
problems. You may encounter erroneous data, too.

• If you are warned of an error found in the market data (e.g. by Chartoasis.com's
charting tool), check how far the error's timestamp is from the time interval you are
observing. When analyzing the current week, an error that happened yesterday can
influence the indicators much more than an error that happened 20 years ago. The
latest error is the most important one, which is why Chartoasis.com's chart software
reports the latest error.

• The scale of the error: it is very hard to tell how large an impact an error can have,
because calculations made on the faulty data can amplify or reduce the error's effect.
For example, if the error makes the closing price equal to the daily maximum, %K in the
stochastic oscillator is 100%, but if the closing price equals the daily minimum, it is
0%. There may be only a 1-2% difference between the correct and the erroneous value, yet
it can cause a large change in the value of the indicator.

• Extraordinary events: events that cause an extraordinary market closure may result in
inconsistent data (see the 9/11 example above). If something like that happens, make
sure you can trust your data.

• Volume information: before using indicators that depend on trading volume, you must
know which markets are represented by the trading volume listed in the market data (e.g.
EUR/USD or oil are traded on many platforms). It is also important that some vendors
mean turnover (the total value of the traded instruments) by trading volume, while others
mean the total number of instruments traded. Chartoasis.com's free technical analysis
software always loads the number of shares traded where possible. Remember that Yahoo!
Finance returns the daily average volume instead of the total volume when downloading
weekly market data.

• Currencies, numbers, prices: the currency of prices in downloaded data may not be
unambiguous in all cases, since some providers offer data in several currencies. It may
happen that you download data of Gazprom from RTS in USD but the dividend is given in RUR
on the website of the company. Numbers may be rounded, too.

• Some numbers may seem contradictory at first sight. NSEINDIA.COM's data for equities
contains 'close price' and 'last price' columns. 'Close price' is the weighted average
price of the last half hour and 'last price' is the price of the last trade. You always
have to check information like this.

• Amount of data: using too little data may cause inaccuracy of indicators and functions

of market data. When selecting download interval take care that some methods depend

Minimum, maximum, open and close values

As the names say, the maximum value is the highest and the minimum value the lowest price
within the period (15 minutes, a day, etc.). The minimum must be less than or equal to the
maximum, while the open and close values must lie between the minimum and the maximum
(inclusive). All prices must be positive numbers.
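
The rules above can be checked mechanically before any analysis. The following is a
minimal sketch for one price bar; the function name ohlc_problems and the wording of the
messages are our own choices for illustration, not part of any particular tool.

```python
def ohlc_problems(open_, high, low, close):
    """Return a list of rule violations for one bar (empty list = consistent)."""
    problems = []
    if min(open_, high, low, close) <= 0:
        problems.append("non-positive price")
    if low > high:
        problems.append("minimum above maximum")
    if not (low <= open_ <= high):
        problems.append("open outside [min, max]")
    if not (low <= close <= high):
        problems.append("close outside [min, max]")
    return problems

# A consistent bar, then the same bar with minimum and maximum swapped.
ok_bar = ohlc_problems(open_=10, high=12, low=9, close=11)
swapped_bar = ohlc_problems(open_=10, high=9, low=12, close=11)
```

A bar whose minimum and maximum are swapped violates several rules at once, which makes
this kind of error easy to spot when it affects a whole file.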

Obvious, isn't it? But there is a reason to spend time on it.

In our experience these fields can be erroneous. From time to time the values recorded
for some days are faulty.

These errors may come from the old times when there were no computers logging market
events. The likelihood of such errors is very high in data files that cover long time
periods (e.g. you can download the Dow Jones Industrial Average's data since 1928).

Computers can also be wrong when they are set up incorrectly. Unfortunately, when
computers are wrong, they are wrong consistently:

In our practice we have already found a data provider that consistently swapped the
minimum and maximum columns in its officially released data. Just imagine what kind of
effect this could have on the analysis. We alerted the provider to the error, but it took
them more than a year to swap the two columns back!

Errors like this are reported by Chartoasis.com's free technical analysis software, too.

If there has been any transaction, the number of shares bought and sold should be
positive.

From the point of view of market data, this is only partially true. Some providers have
less information about trading volume than about pricing. For example, they may have
pricing information for the past 10 years but volume data only for the last 5 years. In
this case the missing volume information is filled with zeros.

Sometimes volume is not available at all, so a zero volume is not really an error.
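
Before using volume-dependent indicators it helps to find where real volume data begins.
The following is a minimal sketch, assuming the volumes are given oldest first; the
function name first_volume_index is our own choice for illustration.

```python
def first_volume_index(volumes):
    """Index of the first bar with non-zero volume, or None if there is none.

    `volumes` is assumed to be in chronological order (oldest first).
    """
    for index, volume in enumerate(volumes):
        if volume > 0:
            return index
    return None

# Hypothetical series: volume history starts at the fourth bar.
start = first_volume_index([0, 0, 0, 1200, 0, 950])
```

Bars before the returned index carry no volume information and should be left out of
volume-based calculations. A zero after that point may simply be a period without trades.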

A transaction list contains all the information necessary to derive intraday data, EOD
data and weekly data.
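
As an illustration, daily bars can be derived from a transaction list as follows. This is
a minimal sketch, assuming each transaction is a (timestamp, price, quantity) tuple sorted
by time; the function name ticks_to_daily and the tuple layout are our own assumptions.
Taking the last trade as the close is also an assumption: some exchanges define the
official close differently, as the NSEINDIA.COM example in this document shows.

```python
from datetime import datetime

def ticks_to_daily(ticks):
    """Aggregate (timestamp, price, quantity) ticks into daily OHLCV bars.

    `ticks` must be sorted by timestamp, oldest first.
    Returns {date: {"open", "high", "low", "close", "volume"}}.
    """
    bars = {}
    for timestamp, price, quantity in ticks:
        day = timestamp.date()
        bar = bars.get(day)
        if bar is None:
            # First trade of the day sets every price field.
            bars[day] = {"open": price, "high": price, "low": price,
                         "close": price, "volume": quantity}
        else:
            bar["high"] = max(bar["high"], price)
            bar["low"] = min(bar["low"], price)
            bar["close"] = price  # last trade seen so far
            bar["volume"] += quantity
    return bars

# Hypothetical transactions on one trading day.
sample_ticks = [
    (datetime(2010, 4, 19, 9, 30), 100.0, 200),
    (datetime(2010, 4, 19, 11, 0), 102.5, 100),
    (datetime(2010, 4, 19, 15, 59), 101.0, 300),
]
daily = ticks_to_daily(sample_ticks)
```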

Whenever we are asked to work with data that can be tested against other data, we run the
test, and we almost always find some errors. This is not only about testing the data but
about testing our own software, too. When an error is found, we always check it with
manual calculations.

First we found that providers follow different conventions when providing weekly data.
For example, Yahoo! Finance provides weekly trading volume as the average daily volume
over the week, while other providers mean the total volume traded during the week.

Chartoasis.com's free technical analysis software calculates weekly data from the EOD
data to avoid this kind of confusion and to make usage more convenient. There is no need
to download weekly data, since our experience shows that EOD data is more trustworthy.
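
Deriving weekly bars from EOD data can be sketched as follows. This is only an
illustration under our own assumptions, not the implementation used by Chartoasis.com's
software: daily bars are (date, open, high, low, close, volume) tuples sorted by date, and
weeks are grouped by ISO calendar week. The function name daily_to_weekly is hypothetical.

```python
from datetime import date

def daily_to_weekly(daily_bars):
    """Aggregate daily bars into weekly bars keyed by ISO (year, week).

    `daily_bars` is a list of (date, open, high, low, close, volume) tuples
    sorted by date, oldest first.
    """
    weeks = {}
    for day, open_, high, low, close, volume in daily_bars:
        iso = day.isocalendar()
        key = (iso[0], iso[1])  # ISO year and ISO week number
        week = weeks.get(key)
        if week is None:
            weeks[key] = {"open": open_, "high": high, "low": low,
                          "close": close, "volume": volume, "days": 1}
        else:
            week["high"] = max(week["high"], high)
            week["low"] = min(week["low"], low)
            week["close"] = close       # close of the last day seen
            week["volume"] += volume    # total volume of the week
            week["days"] += 1
    return weeks

# Hypothetical daily bars: three days in one week, one day in the next.
sample_days = [
    (date(2010, 4, 19), 10.0, 11.0, 9.5, 10.5, 1000),
    (date(2010, 4, 20), 10.5, 12.0, 10.0, 11.5, 1500),
    (date(2010, 4, 21), 11.5, 11.8, 10.8, 11.0, 500),
    (date(2010, 4, 26), 11.0, 11.2, 10.9, 11.1, 700),
]
weekly = daily_to_weekly(sample_days)
```

The sketch stores the total weekly volume together with the number of trading days, so
the average daily volume (volume divided by days) can be computed when data has to be
compared with a provider that reports the average.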

When we compared official EOD data with EOD data derived directly from tick-by-tick data,
we found that the minimum, the closing price, the volume, etc. can each be wrong. The
errors were rare and not too big, but the comparison revealed missing data, too.

When we compared weekly data derived from the EOD data with official weekly data, we found
only rare and small problems, except for the weeks around 9/11, when total chaos ruled. We
found missing weeks (there were days when the market was open and EOD data was available,
but there was no weekly data for that week), volume data turned upside down, etc. See the
example below.

What is an error in the market data?

We mentioned many kinds of data in the previous chapter, such as opening price, volume,
date, etc. If you download market data as a file, you see a lot of numbers. Are these
numbers all OK? When you make an investment decision based on market data, it is an
important question whether your market data is valid. In our experience market data is
error-free in most cases. But what if you accidentally base your decision on false data?
What if the signal you read from an indicator appears only because the indicator
magnified a data error? There are many reasons for such errors: extraordinary events,
human error, badly recorded data from the past...
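
To illustrate how an indicator can magnify a data error, here is a minimal sketch of the
stochastic oscillator's %K over a short lookback window. All prices are hypothetical, and
the function name stochastic_k is our own choice. The erroneous close differs from the
correct one by about 1.5%, yet %K jumps from 50 to 100.

```python
def stochastic_k(highs, lows, close):
    """%K = 100 * (close - lowest low) / (highest high - lowest low)."""
    highest, lowest = max(highs), min(lows)
    if highest == lowest:
        raise ValueError("flat price range: %K is undefined")
    return 100.0 * (close - lowest) / (highest - lowest)

# Hypothetical three-day window.
highs = [101.0, 102.0, 101.5]
lows = [99.0, 100.0, 99.5]

k_correct = stochastic_k(highs, lows, close=100.5)  # correct closing price
k_faulty = stochastic_k(highs, lows, close=102.0)   # close wrongly recorded as the maximum
```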

The answer to the question "How can you check if the market data is valid?" is simple: the
values in market data are not independent; there are relations that must hold between
them. These relations can be checked before the market data is used, and you can be warned
if there is a problem. This does not guarantee that all values in the data are true, but
it is a good way to check that the data does not contain "garbage".

A timestamp is the time of an event, such as the time of a transaction or the time of a
change in the order book. A date is a kind of timestamp, too. Let's see what can be
required of timestamps!

Timestamps in the market data must follow each other in ascending or descending order.

For intraday data it is possible that a lot of timestamps are missing, e.g. if the
instrument is not very liquid. There are also stock exchanges where trading in a stock is
halted when the price changes too fast or an important announcement is made.

For daily data it is natural that market information is missing for some days, since
there are national and religious holidays and weekends. In some countries (e.g. Hungary)
it is also possible that a workweek is temporarily made 6 days long to make another
holiday longer. (If you are trading in a foreign country, you should be aware of the local
holiday customs.) Even together, these anomalies cannot cause data to be missing for more
than a week. A date must not occur more than once.
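
These requirements on dates can be sketched as a simple check. The function name
date_problems and the 7-day threshold are our own assumptions based on the rule of thumb
above; the threshold may need adjusting for markets with long holiday breaks.

```python
from datetime import date

def date_problems(dates, max_gap_days=7):
    """Check a list of daily dates for ordering, duplicates and long gaps."""
    problems = []
    if len(set(dates)) != len(dates):
        problems.append("duplicate date")
    ascending = sorted(dates)
    # Both a rising and a falling order are acceptable.
    if dates != ascending and dates != ascending[::-1]:
        problems.append("dates are neither ascending nor descending")
    for earlier, later in zip(ascending, ascending[1:]):
        gap = (later - earlier).days
        if gap > max_gap_days:
            problems.append(f"gap of {gap} days after {earlier}")
    return problems

# Hypothetical file in which more than two weeks are missing.
report = date_problems([date(2010, 4, 1), date(2010, 4, 19), date(2010, 4, 20)])
```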

It is less likely that market data is missing for a lot of days, since data providers usually

repeat previous closing price for days where no trades have happened while the exchange

was open.

An error must be suspected when a week or two is missing from the data. This often
happens along with a jump in the price, as seen in the figure below:

Errors like this are reported by Chartoasis.com's technical analysis software.

In our practice we have come across cases where one data provider had market data for a
day while another did not. Our inspection revealed that the day in question was an
officially announced market holiday, so the provider with the missing data was right! The
other provider's data was a copy of a different day's data.