The market economy of Russia is very heterogeneous, both in the business attractiveness of various commodity categories and in the degree of entrepreneurial activity in various regions of the country. The information system described below makes it possible to identify the commodity groups most in demand on the Russian market and to determine regional differences in the intensity of business activity. It also supports trend analysis (starting in February 1995) and tracking of other developments in entrepreneurial activity in the Russian marketplace.
The system draws on daily data for a few indicators that characterize the informational activity of businesses in the Russian market. The analysis covers the portion of that activity which takes the form of messages sent by entrepreneurs through the commercial newsgroups of the Russian national computer network RELCOM. The indicators estimate the intensity of the flows of commercial messages within individual commodity categories and across the major economic regions whose entrepreneurs are active in the Russian market. For each indicator, the information system provides (within a given time range): 1) ratings of commodity groups or economic regions; 2) a graphical representation of the dynamics of the indicator. The initial data is updated once a month and is available to outside users (in ASCII format). When creating ratings and time graphs of changes in the indicators, users can change the time range and select the commodity groups and economic regions to appear on the graphs.
Official indicators of business activity are not well developed in Russia. We offer several indirect indicators which can be used to understand the type and the extent of commercial activity, as well as the overall state of the market. One source of such data is the Russian computer network RELCOM which, given the generally poor quality of communications available in Russia, is an indispensable source of information for the majority of Russian business people.
Business messages communicated through the RELCOM network are routine advertised offers to sell or buy a specific good or service. On average, the price of communicating such a message is considerably less than $1.00. In addition, a customer who has neither a computer nor access to the network can send a message through firms that provide such services. The average cost of a subscription to the network is $20.00. The average cost of the equipment needed to connect to the network does not exceed $600 - $700. The nationwide RELCOM network covers the entire territory of Russia and is accessible to all Russian-speaking Internet users. In terms of accumulated user base, RELCOM is the unquestioned leader among similar Russian computer networks. In the first quarter of 1995 alone, more than 140,000 "commercial propositions" were communicated over the network. User addresses are structured to include the name of the region where the user is located. The biggest advantage of this data source is therefore its regional content, which provides a reasonably accurate profile of local business activity.
The RELCOM network is one of the few nationwide information technologies available in Russia (in addition to the two national television channels and several newspapers and magazines) that form a unified Russian marketplace across the vast territory of the Eurasian continent (for instance, the time difference between the westernmost and easternmost time zones is 12 hours). Cross-regional (12 Russian regions and 14 ex-Soviet republics) and cross-sectional (21 commodity groups) sets of commercial propositions have been collected on a daily basis since February 1995.
The system is organized in three tiers (levels). The first (master) "page" of the system lets the user choose the time range for the indicator analysis (see the Section "Time Range"), select the type of indicator to be analyzed, and, by pressing the Submit Query button, activate the next (second) "page" of the system. In addition, from the first "page" a user can obtain commentaries on the system and the initial data, which can be used outside the system (for the restrictions on use of this data, see the Section Copyright).
Statistics have been gathered and processed since February 1995, so the earliest left margin of the time range is 9502.
The default value is the entire period for which statistics have been processed.
After selecting one of the index types and pressing the Submit button, you get a page listing the relevant subtypes and their ratings. To get a graph, check one or more subtypes and submit the query on the second page.
The first four of the six types of indicators shown on the first page can be displayed in one of two modes: 1) absolute value of the indicator (amount), or 2) daily rate of growth of the indicator (index). This selection affects the graphical representation of the indicators: in the first case graphs of absolute values are constructed, and in the second, graphs of growth rates.
The indicator Advertisements in group (described below,
together with the list of commodity groups):
Individual values of this indicator for a given commodity
group correspond to the daily number of communications
directed to that newsgroup. The index of these values
then yields the daily growth rate.
The indicator Advertisements in region (the Russian regions
and ex-USSR regions are listed below):
The assignment of individual communications to the
appropriate regions is based on the name of the
second-level domain in the address of the sender. For
example, messages with return addresses containing
*.msk.su or *.msu.ru are assigned to the Moscow region.
Communications whose region cannot be determined are
grouped into the category UNKNOWN. In addition, there are
two groups of communications with the first-level domain
names "org" and "com" in their return addresses, which
usually originate in foreign countries other than the
former Soviet republics. To simplify the analysis, the
regions of Russia with low information activity are grouped
under the heading "rest of Russia"; the corresponding
non-Russian regions form the group called "rest of world".
Individual values of this indicator for a given economic
region correspond to the daily number of communications
originating in that region. The index of these values
measures the daily growth rate.
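The region assignment described above could be sketched roughly as follows. This is only an illustration: the function name `region_of` and the lookup table are hypothetical, and only the two Moscow domains are taken from the text; the real table used by the system is not given here, and the "rest of Russia" and "rest of world" grouping is left out.

```python
# Hypothetical domain-to-region table; only the two Moscow entries come
# from the text, the rest of the real table is not given there.
DOMAIN_TO_REGION = {
    "msk.su": "Moscow",
    "msu.ru": "Moscow",
}

# First-level domains that the text says form their own (foreign) groups.
FOREIGN_FIRST_LEVEL = {"org", "com"}


def region_of(sender_address: str) -> str:
    """Return the region label for a sender address such as user@host.msk.su."""
    host = sender_address.rsplit("@", 1)[-1].strip().lower()
    labels = [part for part in host.split(".") if part]
    if not labels:
        return "UNKNOWN"
    if labels[-1] in FOREIGN_FIRST_LEVEL:
        return labels[-1]
    if len(labels) >= 2:
        # Look up the last two labels, e.g. "msk.su".
        return DOMAIN_TO_REGION.get(".".join(labels[-2:]), "UNKNOWN")
    return "UNKNOWN"
```

Addresses whose domain is not in the table fall into UNKNOWN, matching the category named in the text.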
Executing Submit Query on the first page opens the second
page of the system. The second page displays a table with
the following attributes of the indicator selected by the
user:
- the list of commodity groups or regions (depending on the
type of indicator chosen), in descending order of rating
(entries nearer the top of the list have higher ratings
than those below them); for the indicator Avg. traveling
time, the second page presents its graph;
- for each commodity group or region, the value of the
parameter used to calculate the rating; this parameter is
the total number of communications within that commodity
group or region during the time range chosen by the user
(for changing the time range, see below);
- for each commodity group or region, the rate of growth of
the indicator selected on the first page during the last
month, calculated as (A(t)/A(t-1) - 1) * 100%, where A is
the total number of communications in a given month, t is
the last month of the selected time range, and t-1 is the
month immediately preceding it.
For some types of data, monthly rates of growth (column
"ind") are not determined and are therefore absent from
the second page (indicators 4 - 6 in the list on the
first page of the system).
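As a small worked sketch of the monthly rate of growth, assuming it is meant as a percentage change of the monthly total A(t) relative to A(t-1): the function name is hypothetical, and the handling of an empty previous month is an assumption, since the text does not cover that case.

```python
from typing import Optional


def monthly_growth_percent(current_total: int, previous_total: int) -> Optional[float]:
    """Percentage change of the monthly total A(t) relative to A(t-1)."""
    if previous_total == 0:
        # The text does not say how an empty previous month is handled;
        # returning None here is an assumption.
        return None
    return (current_total / previous_total - 1.0) * 100.0
```

For example, 1200 communications in the last month against 1000 in the month before gives a growth of about 20%.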
The second page lets the user: 1) rearrange the list of
commodity groups or regions for the time range entered;
2) view graphs of changes in a given indicator for the
desired commodity groups or regions.
To reevaluate the ratings of commodity groups or regions,
adjust the time range shown on the second page and press
the Submit Query button. No commodity groups or regions
may be highlighted while doing this.
If at least one commodity group or region is highlighted,
pressing Submit Query opens the third page of the system,
which shows the time graphs of the highlighted entries
(for the selected time range). If necessary, the time
range for the graphs can be changed directly from the
second page (to do this, enter the desired new value in
the window "time range").
The third page only displays graphs and does not contain
any commands. To return to the previous page, use the
standard methods of the Internet browser (for example, in
Netscape, press the Back button or select the appropriate
entry in the GO menu).
The processing runs on the news server of Infoteka Ltd, Novosibirsk, Russia
(news.itfs.nsk.su).
Monthly routine (runs on the 7th of the following month):
- factorizes the inverted data by region
- smooths some of the factorized data over time:
the number of ads per day is set equal to the average
over the current and the 6 preceding days.
- archives processed raw data files.
The result is 5 files:
- # of ads. in each group per day (#groups columns, #days rows)
- # of ads. from each region per day (#regions columns, #days rows)
- monthly average of ads. in each group from each region
(#regions columns, #groups rows); this file is not used in plotting
- # of active hosts in each region per day (#regions columns, #days rows)
- # of active hosts in each region since the beginning of statistics
(#regions columns, #days rows); this data is not smoothed.
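The smoothing step of the monthly routine (average over the current and the 6 preceding days) can be illustrated with a short sketch. The function name is hypothetical, and the treatment of the first six days, where fewer than seven values exist, is an assumption rather than something the text specifies.

```python
from typing import List


def smooth_trailing_week(daily_counts: List[float], window: int = 7) -> List[float]:
    """Replace each day's count by the mean of that day and the 6 days before it."""
    smoothed = []
    for day in range(len(daily_counts)):
        start = max(0, day - window + 1)
        chunk = daily_counts[start:day + 1]
        # For the first days fewer than `window` values exist; averaging
        # over what is available is an assumption, not stated in the text.
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed
```

A trailing window like this removes the weekly rhythm of postings (for example, quiet weekends) from the plotted curves, which is presumably why the daily files are smoothed before plotting.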
The visualization routine is implemented as a SHELL script which does
the following:
- builds a command file and the appropriate source data file for `gnuplot'
based on the request from the www browser and the files described above
- runs `gnuplot' and `ppmtogif' -- the converter of `gnuplot' output
to a GIF file.
Once the script has run, the http daemon sends a link to the GIF file
back to the www browser.
GIFs aged >= 24 hours are deleted by a `cron' command every midnight.
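The original routine is a SHELL script whose contents are not shown here. Purely as an illustration of the first step (building the command file for `gnuplot'), the sketch below generates such a file as text in Python. The function name, the file names and the column layout of the data file are all hypothetical; the `pbm` terminal is used on the assumption that gnuplot's output is then fed to `ppmtogif'.

```python
from typing import List


def gnuplot_commands(data_file: str, ppm_file: str, series: List[str]) -> str:
    """Build gnuplot commands plotting one line per series.

    Column 1 of data_file is taken to be the day number and column i+2
    holds the i-th series (a hypothetical layout).
    """
    plots = ", ".join(
        f'"{data_file}" using 1:{i + 2} title "{name}" with lines'
        for i, name in enumerate(series)
    )
    lines = [
        "set terminal pbm color",  # bitmap output that ppmtogif can convert
        f'set output "{ppm_file}"',
        f"plot {plots}",
    ]
    return "\n".join(lines) + "\n"
```

The returned text would be written to a command file and passed to `gnuplot', after which `ppmtogif' would turn the resulting bitmap into the GIF that the browser receives.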
This Web page was launched with the support of INFOTEKA (RELCOM
regional node in Novosibirsk, Russia), the Institute of Economics
and Industrial Engineering (Novosibirsk), the Russian Fund for
Humanities Research, the Fulbright Program and the Krannert School
of Management at Purdue University.
This Web page, as an information system, is Copyright 1996 by Sergei Parinov and Victor Lyapunov. Any
part of this information system may be freely used for any
purpose. If it is published or distributed in part, it must include
this copyright notice. It may not be sold, or placed in something
for sale, without the permission of the authors.
Advertisements in group
contains the quantity of communications and their change
over time within separate commodity groups. The structure
of the commodity groups is almost identical to the
structure of the commercial newsgroup sections that
currently exist in the RELCOM network. Individual
communications are assigned to the appropriate commodity
groups on the basis of the destination newsgroup indicated
in the communication. If a communication appears in
several newsgroups, only the first reference to a
destination newsgroup is used.
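The rule "only the first destination newsgroup is used" can be sketched as below. The function name is hypothetical, and deriving the commodity group name by stripping the `relcom.commerce.` prefix is an assumption made for the example; the text only says that the group structure is almost identical to the newsgroup sections.

```python
# Hierarchy scanned by the data collection routine.
HIERARCHY_PREFIX = "relcom.commerce."


def commodity_group(newsgroups_header: str) -> str:
    """Return the commodity group for the value of a 'Newsgroups:' field."""
    # Cross-posted messages list several newsgroups separated by commas;
    # only the first one counts.
    first = newsgroups_header.split(",")[0].strip()
    if first.startswith(HIERARCHY_PREFIX):
        # Assumption: the group name is the newsgroup name without the prefix.
        return first[len(HIERARCHY_PREFIX):]
    return first
```

With this rule a message cross-posted to several newsgroups is counted exactly once, under its first group.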
audio-video
chemicals
computers
construction
consume
energy
real estate
food
food.drinks
food.sweet
householding
info services
machinery
medicine
metals
money
orgtech
software
stocks
tobacco
transport
Advertisements in region
measures the quantity of communications and their change
over time, computed for the individual regions where these
communications originated. The regional structure is
based on the established division of the territory of
Russia into 10 major economic regions, the separate regions
of Moscow and St. Petersburg, the republics of the former
USSR, and the foreign sector.
Moscow
St.Petersburg
Ural
Center
West Siberia
Volga
Volga-Vyatka
East Siberia
Far East
North Caucasus
North-West
Black Earth Zone
Armenia
Azerbaijan
Belorussia
Georgia
Kazakhstan
Kirgizia
Latvia
Lithuania
Moldavia
Tajikistan
Turkmenia
Uzbekistan
Ukraine
Estonia
Active hosts in group
measures the number of commercial organizations (or
computers) that post messages to the commercial newsgroups
of the RELCOM network during a 24-hour period.
This indicator is computed for each individual group. The
index of the indicator's values reflects the daily growth
rate.
Active hosts in region
measures the number of commercial organizations (or
computers) that post messages to the commercial newsgroups
of the RELCOM network during a 24-hour period (one should
bear in mind that, technically, the same computer address
may be used by several computer sites in a network). This
indicator is computed for each 24-hour period and each
individual region. The index of the indicator's values
reflects the daily growth rate.
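Counting active hosts amounts to counting distinct sender hosts per day. A minimal sketch, assuming the input is a sequence of (day, host) pairs already extracted from the messages; the function name and the day format are hypothetical:

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def active_hosts_per_day(messages: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Count distinct sending hosts per day from (day, host) pairs."""
    hosts_by_day: Dict[str, Set[str]] = defaultdict(set)
    for day, host in messages:
        # A host that posts several times on one day is counted once.
        hosts_by_day[day].add(host.lower())
    return {day: len(hosts) for day, hosts in hosts_by_day.items()}
```

Restricting the input to the messages of one region or one commodity group gives the per-region or per-group variant of the indicator.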
Sum of active hosts in region
measures the total number of commercial entities that have
sent at least one message to the commercial newsgroups
since February 1, 1995. Changes in the value of this
indicator over time make it possible to estimate the rate
of increase in the number of entrepreneurs who use
commercial newsgroups to disseminate their propositions
(note that changes in the indicator during the first
months after February 1995 reflect the accumulation of the
required information, so this period must be excluded from
the analysis). The indicator is computed on a daily
basis, and for each region. The index of the indicator's
values is equal to the daily growth rate.
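The cumulative indicator can be pictured as a running count of distinct hosts seen so far. The sketch below assumes (day, host) pairs as input with day keys that sort chronologically; the function name and the day format are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple


def cumulative_hosts(messages: Iterable[Tuple[str, str]]) -> List[Tuple[str, int]]:
    """Return (day, number of distinct hosts seen up to and including that day)."""
    hosts_by_day: Dict[str, Set[str]] = {}
    for day, host in messages:
        hosts_by_day.setdefault(day, set()).add(host.lower())
    seen: Set[str] = set()
    totals = []
    for day in sorted(hosts_by_day):  # assumes day keys sort chronologically
        seen |= hosts_by_day[day]
        totals.append((day, len(seen)))
    return totals
```

In the first months almost every host is new to the counter, so the curve rises steeply regardless of real growth; this is the accumulation effect the text warns about.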
Advertisements per host in group
measures the mean quantity of communications sent by a
commercial organization. This indicator makes it possible
to evaluate shifts in the information activity of
individual entrepreneurs (it must be remembered that
several entrepreneurs can use one computer to communicate
their messages). The indicator is computed on a daily and
weekly basis, and for each group. The index of the values
of the indicator is not computed.
Advertisements per host in region
measures the mean quantity of communications sent by a
commercial organization. This indicator makes it possible
to evaluate shifts in the information activity of
individual entrepreneurs (it must be remembered that
several entrepreneurs can use one computer to communicate
their messages). The indicator is computed on a daily
basis, and for each region. The index of the values of the
indicator is not computed.
Daily total of advertisements
measures the total quantity of communications sent by all
organizations. The indicator is computed on a daily and
weekly basis. The index of the values of the indicator is
not computed.
Avg. traveling time (days)
measures the average time that it takes for a message to
get from its place of origin to the registration point
(see Section Technical Guide). Changes in this indicator
characterize the overall quality of message transmission
by the network. The indicator is calculated on a daily
basis. The index of the values of the indicator is not
computed.
Group hosts' activity distribution
presents the distribution, by number of active days, of the
hosts that posted to the specified group during a month.
On the plot, Y(X) is the share of hosts that were "active"
on 1...X days. On the second page, the first numeric column
is the average number of active days and the second numeric
column is the average divergence.
This indicator is evaluated only for one-month time ranges;
a query with a wider time range averages the indicators of
the monthly sub-ranges.
The daily/weekly and volume/index modes are not applicable
to this indicator and are ignored.
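Reading Y(X) as a cumulative share (the fraction of hosts that were active on between 1 and X days of the month), the plotted curve could be computed roughly as follows. The function name and the input mapping of host to number of active days are hypothetical.

```python
from typing import Dict, List


def cumulative_share(active_days: Dict[str, int], max_days: int) -> List[float]:
    """Y(X) for X = 1..max_days: share of hosts active on at most X days."""
    total = len(active_days)
    if total == 0:
        return [0.0] * max_days
    return [
        sum(1 for days in active_days.values() if 1 <= days <= x) / total
        for x in range(1, max_days + 1)
    ]
```

A curve that climbs quickly at small X means most hosts post on only a few days of the month; a curve that stays low until large X indicates a core of hosts posting almost daily.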
Regional hosts' activity distribution
presents the distribution, by number of active days, of the
hosts from the specified region that were active during a
month.
On the plot, Y(X) is the share of hosts that were "active"
on 1...X days. On the second page, the first numeric column
is the average number of active days and the second numeric
column is the average divergence.
This indicator is evaluated only for one-month time ranges;
a query with a wider time range averages the indicators of
the monthly sub-ranges.
The daily/weekly and volume/index modes are not applicable
to this indicator and are ignored.
Regional hosts' commodity heterogeneity
presents the distribution, by number of commodity groups
posted to, of the hosts from the specified region that were
active during a month.
On the plot, Y(X) is the share of hosts that posted to
1...X groups. On the second page, the first numeric column
is the average number of groups and the second numeric
column is the average divergence.
This indicator is evaluated only for one-month time ranges;
a query with a wider time range averages the indicators of
the monthly sub-ranges.
The daily/weekly and volume/index modes are not applicable
to this indicator and are ignored.
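The input to the heterogeneity distribution is the number of distinct commodity groups each host posted to during the month. A minimal sketch under that reading, with hypothetical function names and (host, group) pairs as assumed input:

```python
from typing import Dict, Iterable, Set, Tuple


def groups_per_host(postings: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Count the distinct commodity groups each host posted to."""
    groups: Dict[str, Set[str]] = {}
    for host, group in postings:
        groups.setdefault(host, set()).add(group)
    return {host: len(names) for host, names in groups.items()}


def mean_group_count(counts: Dict[str, int]) -> float:
    """Average number of groups per host (the first numeric column, as read here)."""
    return sum(counts.values()) / len(counts) if counts else 0.0
```

An average close to 1 would suggest that hosts in the region specialize in a single commodity group, while larger values point to hosts trading across several groups.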
Group hosts' commodity heterogeneity
presents the distribution, by number of commodity groups
posted to, of the hosts that posted to the specified group
during a month.
On the plot, Y(X) is the share of hosts that posted to
1...X groups. On the second page, the first numeric column
is the average number of groups and the second numeric
column is the average divergence.
This indicator is evaluated only for one-month time ranges;
a query with a wider time range averages the indicators of
the monthly sub-ranges.
The daily/weekly and volume/index modes are not applicable
to this indicator and are ignored.
USER GUIDE TO SECOND PAGE
Column NAME
Column #ADS
Column LMI -- Last Month's Index
This is a preliminary draft only!
The data collection
A daily routine scans the new Usenet news in the
relcom.commerce hierarchy.
The result is a text file containing items consisting (roughly)
of the fields `Date:', `From:' and `Newsgroups:'.
The approximate size of the file is 200 Kb.
- `inverts' the raw data files to produce the files containing info
about articles (advertisements) issued on the same day;
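The exact layout of the daily file is not given, so the following is only a sketch of how such items might be read back. It assumes one field per line and a blank line between items; both the layout and the function name are assumptions.

```python
from typing import Dict, List

FIELDS = ("Date:", "From:", "Newsgroups:")


def parse_items(text: str) -> List[Dict[str, str]]:
    """Split the text into items; a blank line is assumed to separate items."""
    items: List[Dict[str, str]] = []
    current: Dict[str, str] = {}
    for line in text.splitlines():
        if not line.strip():
            # Blank line: close the current item, if any.
            if current:
                items.append(current)
                current = {}
            continue
        for field in FIELDS:
            if line.startswith(field):
                current[field.rstrip(":")] = line[len(field):].strip()
                break
    if current:
        items.append(current)
    return items
```

Each parsed item then carries what the indicators need: the day (from `Date:'), the sender's address and hence the region (from `From:'), and the commodity group (from `Newsgroups:').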
GRATITUDES
COPYRIGHT NOTICE