Any given word (a cluster of characters) in English can be classified into one of the following categories:
WORD: Verb / Adverb / Preposition / Adjective / Noun / Common Noun / Proper Noun
So the first task is to create a "directory" for each of these categories. Each word must then be compared with the words contained in a given directory. If a match occurs, that word gets categorized as belonging to that category. The process has to be repeated, trying to match the word against the words contained in each of the categories, TILL a match is found. If no match is found, that word should be stored separately in a file marked "UNMATCHED WORDS".

Every day, an expert would study all the words contained in this file and assign each of them a definite category, using his "HUMAN INTELLIGENCE". In this way, over a period of time, human intelligence will identify / categorize each and every word contained in the ENGLISH LANGUAGE. This will be the process of transferring human intelligence to the computer.

Essentially, the trick lies in getting the computer (software) to MIMIC the process followed by a human brain while scanning a set of words (i.e. reading) and, by analyzing the "sequence" in which these words are arranged, to assign a MEANING to each word or string of words (a phrase or a sentence). I cannot believe that no one has attempted this before (especially since it has so much commercial value). We don't know who has developed this software or where to find it, so we must end up rediscovering the wheel!

Our computer files contain some 900,000 words which have repeatedly occurred in our records, mostly converted biodatas or words captured from biodatas. We have, in our files, some 3,500 converted biodatas. It has taken us about 6 years to accomplish this feat, i.e.
approx. 600 biodatas converted per year, OR
approx. 2 biodatas converted every working day!
Assuming that all those (converted) biodatas which are older than 2 years are OBSOLETE, this means that perhaps no more than 1,200 are current / valid / useful!
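The arithmetic behind these round figures can be checked with a few lines of Python. This is only a back-of-envelope sketch: the totals come from the note, while the figure of 300 working days per year is an assumption made here for illustration and is not stated anywhere above.

```python
# Figures taken from the note above.
CONVERTED_TOTAL = 3500      # converted biodatas on file
YEARS_ELAPSED = 6           # years it took to convert them
SHELF_LIFE_YEARS = 2        # a converted biodata older than this is treated as obsolete

# Assumption (not in the note): roughly 300 working days in a year.
WORKING_DAYS_PER_YEAR = 300


def conversion_stats(total, years, shelf_life_years, working_days):
    """Return (per_year, per_working_day, still_current) for a steady conversion rate."""
    per_year = total / years
    per_working_day = per_year / working_days
    # At a steady rate, only the last `shelf_life_years` of output is still current.
    still_current = per_year * shelf_life_years
    return per_year, per_working_day, still_current


per_year, per_day, current = conversion_stats(
    CONVERTED_TOTAL, YEARS_ELAPSED, SHELF_LIFE_YEARS, WORKING_DAYS_PER_YEAR
)
print(f"{per_year:.0f} per year, {per_day:.1f} per working day, {current:.0f} still current")
```

The exact values (about 583 per year and about 1,167 still current) are what the note rounds to 600 and 1,200.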
So, one thing becomes clear: the "rate of obsolescence" is faster than the "rate of conversion"!

Of course, we can argue, "Why should we waste / spend our time in converting a biodata? All we need to do is capture the ESSENTIAL / MINIMUM DATA (from each biodata) which would qualify that person to get searched / spotted. If he gets shortlisted, we can always, at that point of time, spend the time / effort to fully convert his biodata." In fact, this is what we have done so far, because there was a premium on the time of data-entry operators. That time was best utilized in capturing the essential / minimum data.

But if the latest technology permits / enables us to convert 200 biodatas each day (instead of just 2) with the same effort / time / cost, then why not convert 200? Why be satisfied with just 2 a day? If this can be made to "happen", we would be in a position to send out / fax out / e-mail converted biodatas to our clients in a matter of "minutes" instead of the "days" which it takes today!

That is not all. A converted biodata has far more KEYWORDS (knowledge, skills, attributes, attitudes etc.) than the MINIMUM DATA. So there is an improved chance of spotting the RIGHT MAN, using a QUERY which contains a large no. of KEYWORDS. So today, if the client "likes" only ONE converted biodata out of TEN sent to him (a huge waste of everybody's time / effort), then under the new situation he should be able to "like" 4 out of every 5 converted biodatas sent to him!
This would vastly improve the chance of at least ONE executive getting appointed in each such assignment. This should be our goal. This goal could be achieved only if:
Step # 1. Each biodata received every day is "scanned" on the same day
Step # 2. Converted to TEXT (ASCII)
Step # 3. PEN given serially
Step # 4. WORD-RECOGNISED (a step beyond OCR, Optical CHARACTER Recognition)
Step # 5. Each word "categorized", indexed and stored in the appropriate FIELDS of the DATABASE
Step # 6. Database "reconstituted" to create the "converted" biodata as per our standard format
Steps 1 / 2 / 3 are not difficult, Step 4 is difficult, Step 5 is more difficult, and Step 6 is most difficult. But if we keep working on this problem, it can be solved: 50% accurate in 3 months, 70% accurate in 6 months, 90% accurate in 12 months.
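Step 5 rests on the directory lookup described at the start of this note: try each category's word list in turn, and set aside whatever matches nothing for an expert to review. A minimal Python sketch of that loop follows. The function names, the tiny sample directories and the sample words are all hypothetical, chosen only to make the example runnable; real directories would hold many thousands of words.

```python
from typing import Dict, Iterable, List, Optional, Set, Tuple

# Hypothetical miniature "directories", one set of words per category.
DIRECTORIES: Dict[str, Set[str]] = {
    "Verb": {"manage", "design", "supervise"},
    "Adjective": {"senior", "technical"},
    "Common Noun": {"engineer", "factory", "manager"},
    "Proper Noun": {"india"},
}


def categorize(word: str, directories: Dict[str, Set[str]]) -> Optional[str]:
    """Try each directory in turn; return the first category that contains the word."""
    key = word.lower()
    for category, words in directories.items():
        if key in words:
            return category
    return None  # no match in any directory


def categorize_all(
    words: Iterable[str], directories: Dict[str, Set[str]]
) -> Tuple[Dict[str, str], List[str]]:
    """Split words into a {word: category} map and an UNMATCHED WORDS list."""
    categorized: Dict[str, str] = {}
    unmatched: List[str] = []
    for word in words:
        category = categorize(word, directories)
        if category is not None:
            categorized[word] = category
        elif word not in unmatched:
            unmatched.append(word)
    return categorized, unmatched


def assign_by_expert(word: str, category: str, directories: Dict[str, Set[str]]) -> None:
    """Record the expert's decision so that the word matches next time."""
    directories.setdefault(category, set()).add(word.lower())


categorized, unmatched = categorize_all(["Senior", "Engineer", "Metallurgy"], DIRECTORIES)
print(categorized)  # words that matched a directory
print(unmatched)    # words waiting for the expert
```

Once the expert calls `assign_by_expert("Metallurgy", "Common Noun", DIRECTORIES)`, the same word is matched automatically on the next pass, which is the "transfer of human intelligence" the note has in mind.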
Even though there are about 900,000 indexed WORDS in our ISYS file, all of these do not occur (in a biodata / record) with the same frequency. Some occur far more frequently, some frequently, some regularly, some occasionally and some rarely. Then, of course, (in the English language) there must be thousands of other words which have not occurred EVEN ONCE in any of the biodatas. Therefore we won't find them amongst the existing indexed file of 900,000 words. It is quite possible that some of these (so far missing) words may occur if this file (of words) were to grow to 2 million.
As this file of words grows and grows, the probabilities of:
· a word having been left out, and
· such a left-out word being likely to occur (in the next biodata)
are "decreasing".
Meaning, some 20% of the words (in the English language) make up maybe 90% of all the "occurrences". This would become clear when we plot the frequency distribution curve of the 900,000 words which we have already indexed. And even when this population grows to 2 million, the shape (the nature) of the frequency distribution curve is NOT likely to change! Only with a much larger WORD-POPULATION will the "accuracy" marginally increase.

So our search is to find: which are these 20% (20% x 9 lakh = 180,000) words which make up 90% of the "area under the curve", i.e. the POPULATION? Then we focus our efforts on "categorizing" these 180,000 words in the first place. If we manage to do this, 90% of our battle is won. Of course, this presupposes that before we can attempt "categorization", we must be able to recognize each of them as a "WORD".

6 years down the line (since writing this note), I feel this no. is no more than 30,000 words!
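Finding the words that make up 90% of the "area under the curve" amounts to sorting words by frequency and accumulating their counts until the target share is reached. The Python sketch below shows one way to do that. The function name and the sample tokens are hypothetical, and the 90% target is simply the figure used in this note.

```python
from collections import Counter
from typing import Iterable, List


def words_covering(tokens: Iterable[str], target: float = 0.90) -> List[str]:
    """Return the most frequent words that together reach `target` share of all occurrences."""
    counts = Counter(tokens)
    total = sum(counts.values())
    covered = 0
    selected: List[str] = []
    # most_common() yields words from the highest count to the lowest.
    for word, count in counts.most_common():
        if covered / total >= target:
            break
        selected.append(word)
        covered += count
    return selected


# Hypothetical sample: 10 occurrences spread over 3 distinct words.
sample = ["manager"] * 6 + ["engineer"] * 3 + ["metallurgy"]
print(words_covering(sample, target=0.90))
```

Run against the full indexed file, the length of the returned list would show directly whether the words worth categorizing first number nearer 180,000 or 30,000.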
COMPANY
Similar Meaning Words
Firm / Corporation / Organization / Employer / Industry (misnomer)
Associated Words
Name of (Company) / Company (Profile) / Present / Current / Past / (Company) Products / (Company) Structure / (Company) Organization
CAREER
Career Path / Career History / Career Achievement / Career Growth / Career Objective / Career Progression / Career Information / Career Details / Career Development / Career Goal / Career Interest / Career Nature / Career Profile / Career Record
Associated Words
Past / Present / Professional / Academic / Previous
Similar Meaning Words
Service
CURRICULUM
Similar Meaning Words
Course / Subjects / Topics
Related Words
Academy / Scholastic / Education / Research / Exam / Scholarship / Graduation / Training / Honors / Teaching / Institution / University / College / Degree / Diploma / Certificate / Learning / Pass / Passing / Year of passing / Project / Qualifications
DEPENDENTS
Associated Words
Family / Father / Mother / Brother / Sister / Wife / Children / Son / Daughter
EDUCATION
Education(al) / Educational Qualifications / Qualifications / Academic Qualifications / Technical Qualifications
Associated Words
Qualification / School / Degree / Diploma / University / Graduate / Graduation / Institution / Doctorate / Certificate / Curricular / Course / Exam / Topics / Subjects / Electives / Undergraduate / Fellow / Honors / Distinction / First Class / Grade Point Average (GPA)
EXPERIENCE
Employment experience / Work experience / Job experience / Professional experience / Current experience / Past experience / Present experience / Relevant experience / Industrial / Industry experience / Teaching experience / Details of experience / Foreign experience / Factory experience / Global experience / Management experience / Site experience / Major experience / Practical experience / Research experience / Service experience / Training experience / Technical experience
EMPLOYER
Company / Firm / Organization / Corporation
Related Words
Present / Current / Past / Career / Job / Service / Name of
EMPLOYMENT
Employment Particular / Employment Past / Employment Present / Employment Current / Employment Record / Employment History / Employment Existing / Employment Data / Employment Nature / Employment Period
FUNCTION
Responsibility / Duty / Job / Past / Management / Present / Description / Existing / Profile / Current / Skills (associated with) / Concurrent / Structure (Functional) / Major / Organization (Functional) / Minor / Technical / Nature of / Reports to
FACTORY
Plant / Site / Works / Manufacturing location
INFORMATION
Data / Knowledge / Database / Data Sheet
Processing / Current / Collection / Past / Retrieval / Personal / Analysis / Job Related / Category / Work Related / Career / Additional / Details / Institutional / Compilation / Particular / Field of / General / Industry (IT industry) / Nature of / Purpose of / Product / Project Related / Organizational / Service / State of / Dissemination
EXECUTIVE
Employee / Worker / Workman / Supervisor / Officer / Manager / Data Sheet / Profile / Staff / Company / Workforce / Responsibility / Position / Status / Search / Skills / Selection / Title / Placement / Designation / Interview / Bio Data / Execute / Exposure / Resume / Post / Salary / Compensation / Training / Experience
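Word lists like these can serve as a lookup table that maps whatever heading a biodata happens to use onto one standard field name. The Python sketch below illustrates the idea with three of the headings and their similar-meaning words, copied from the lists above. The function names are hypothetical, and only headings whose similar-meaning words do not overlap were chosen (COMPANY and EMPLOYER share several words, which a real table would have to resolve).

```python
from typing import Dict, List, Optional

# Standard field name -> similar-meaning words, taken from the lists above.
SIMILAR_WORDS: Dict[str, List[str]] = {
    "COMPANY": ["Firm", "Corporation", "Organization", "Employer"],
    "CURRICULUM": ["Course", "Subjects", "Topics"],
    "FACTORY": ["Plant", "Site", "Works", "Manufacturing location"],
}


def build_lookup(similar_words: Dict[str, List[str]]) -> Dict[str, str]:
    """Invert the table: every similar-meaning word points back to its standard field."""
    lookup: Dict[str, str] = {}
    for field, words in similar_words.items():
        lookup[field.lower()] = field
        for word in words:
            lookup[word.lower()] = field
    return lookup


def standard_field(heading: str, lookup: Dict[str, str]) -> Optional[str]:
    """Return the standard field for a heading, or None if it is not in the table."""
    return lookup.get(heading.strip().lower())


LOOKUP = build_lookup(SIMILAR_WORDS)
print(standard_field("Plant", LOOKUP))
print(standard_field("  firm ", LOOKUP))
print(standard_field("Hobbies", LOOKUP))
```

A heading that returns `None` would be a natural candidate for the "UNMATCHED WORDS" file, to be assigned by the expert.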
h.c.parekh