CORPORATE DATABASE
Designation - Industry - Company
Function - SEARCH-PARAMETER - Product
(Below the diagram for CANDIDATE
DATABASE):
- Edu. Quali
- Age
- Exp.
- Product/"Service" Exposure/knowledge
skills / city
(Below the diagram for CORPORATE
DATABASE):
- Tech. collabora
- Top Executives
- JV
- city
- Plants / Sales Office
Anjaria
10/7/99
On our website homepage, there is
a feature called
Triangle of
Industry-Company-Product
This database contains some 4000+
CII Member data received by us on a floppy.
For each of these Companies,
database provides
- What "products" they are
manufacturing
- What "Industry" do they belong to.
Using this database, a surfer on
our site can
- enter "Company Name" / Get "Products"
manufactured by the Company
- enter "Product Name" / Get list of
"Companies" manufacturing that product
- enter "Industry Name" / Get list
of "Companies" Operating in that Industry.
This is very Convenient.
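The three look-ups described above can be sketched with three small indexes built from the same rows. This is only an illustration: the sample rows and the names `ROWS`, `products_of`, `makers_of`, `companies_in` and `lookup` are hypothetical, not taken from the website's actual database.

```python
from collections import defaultdict

# Hypothetical sample rows: (company, product, industry). Real rows would
# come from the CII / KOMPASS data mentioned in the note.
ROWS = [
    ("ABC Motors Ltd", "Tractor", "Automobile"),
    ("ABC Motors Ltd", "Diesel Engine", "Engineering"),
    ("XYZ Pumps Pvt Ltd", "Diesel Engine", "Engineering"),
]

products_of = defaultdict(set)    # company  -> products
makers_of = defaultdict(set)      # product  -> companies
companies_in = defaultdict(set)   # industry -> companies

for company, product, industry in ROWS:
    products_of[company].add(product)
    makers_of[product].add(company)
    companies_in[industry].add(company)


def lookup(index, key):
    """Return the sorted matches for key, or an empty list if unknown."""
    return sorted(index.get(key, ()))
```

A surfer entering a product name would correspond to `lookup(makers_of, "Diesel Engine")`, and so on for the other two sides of the triangle.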
Now, we are about to upload (into
this website-based database) another 32000 records compiled from KOMPASS-1993
directory. Maybe there will be some duplication (with CII data), and some
records may be obsolete because the data is 6 years old. Even then it is quite
useful.
I believe, one feature of the
database allows you to Click on the name of the Company and get to see
its "Contact Address / Phone No." etc.
For MODULE 1, we need to
create a MASTER-FILE of
Company Name Industry Name
(Matching)
Why?
Because, when a Candidate says
(writes)
- "I am working in "Company ABC"
- or
- "I worked in "Company XYZ",
the software figures out (from MASTER
MATCHING LIST), that
- Company ABC belongs to "Industry L"
- Company XYZ " " "M".
We have, perhaps, managed to
establish this link for some 15000 + Companies.
Ultimately we must establish this
link for 500,000 companies (LTD & PRVT. LTD). A daunting task!
Perhaps there is a "short-cut".
Look at the triangle once again
Every child has to have a Mother
- so every product has to belong to an Industry. No orphans please!
Now if one link of the triangle
is missing, it would look as follows:
But
If we have created a MASTER
whereby
- every CHILD'S MOTHER is identified
i.e.
- every PRODUCT'S INDUSTRY-NAME is established
then the matter becomes quite SIMPLE
(although somewhat circuitous!)
The moment we enter
- a Company's Name & the "Products" which it
manufactures (or "Services" it sells),
we can travel "anti-clockwise"
along the sides of this triangle and figure out
- To which INDUSTRY does this COMPANY
belong to!
NO need to create "COMPANY
VS INDUSTRY" Master!!
Let that remain the "Missing
Link".
We travel,
Company → Product → Industry Name
Of course, if a Company
manufactures several "products", it may simultaneously belong
to several "Industries". No problem. We should enter all those
Industry-Names against the Name of that Company.
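The "anti-clockwise" travel (Company → Product → Industry) can be sketched as a two-step look-up. The dictionaries and the function name `industries_for` below are hypothetical, used only to illustrate the idea that no direct Company vs Industry master is needed.

```python
# Hypothetical masters; the names and entries are illustrative only.
PRODUCT_TO_INDUSTRY = {      # every product has exactly one "mother"
    "Tractor": "Automobile",
    "Scooter": "Automobile",
    "Cement": "Building Materials",
}
COMPANY_TO_PRODUCTS = {
    "Company ABC": ["Tractor", "Cement"],
    "Company XYZ": ["Scooter"],
}


def industries_for(company):
    """Travel Company -> Product -> Industry and collect every industry.

    Returns (industries, orphans); orphans are products that have no
    industry in the master yet and therefore need attention.
    """
    industries = []
    orphans = []
    for product in COMPANY_TO_PRODUCTS.get(company, []):
        industry = PRODUCT_TO_INDUSTRY.get(product)
        if industry is None:
            orphans.append(product)
        elif industry not in industries:
            industries.append(industry)
    return industries, orphans
```

A company with several products simply ends up with several industry names, as the note suggests.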
Whereas a COMPANY may
enter/exit a Product (Service), the relationship between the Product & Industry will always remain SAME (mother
& child).
How soon can we build-up an INDUSTRY
vs PRODUCT MASTER?
Regards
H. E. P.
cc: Nirav/Cyril
cc: CMT/Nishit
cc: Sajida
Anjaria :
9-7-99
Your task is to ensure that every
Resume in our database (on LAN or on INTERNET) is correctly
"coded" for
- Industry (Category) Name
- Function ( " ") Name
- Designation ( " ") Name
For each of the above, I have
created MASTERS (MASTER LISTS) which you should thoroughly study before
you start work.
Some small MASTER-LISTS
did exist when we started HH3P database created by Mr. Nagle, but I
explained to you how the present problem arose because, the data entry
operators, whenever they could not find appropriate MASTER, simply left
the
Industry / Function /
Designation FIELDS
blank & proceeded with the remaining data-entry!
This is why our current
Internet-based database of 51000+ resumes contains (maybe)
- Some 20000+ resumes without "Industry"
- Some (Same/other) 20000 resumes without "Function"
- Similar No. of resumes without "Designation".
There can be a lot of Overlap
in these numbers, because a given resume may have
- Only "Industry" missing
- " "Function" "
- " "Designation" "
- any permutation/combination of above.
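Because the three counts overlap, a tally per combination of missing fields gives a clearer picture than three separate totals. The sketch below assumes each resume is a dictionary with `industry`, `function` and `designation` keys; those key names and the function `missing_profile` are assumptions for illustration.

```python
from collections import Counter

CODED_FIELDS = ("industry", "function", "designation")


def missing_profile(resumes):
    """Count resumes by the exact combination of coded fields left blank."""
    counts = Counter()
    for resume in resumes:
        # A field counts as missing when it is absent, None or empty.
        missing = tuple(f for f in CODED_FIELDS if not resume.get(f))
        counts[missing] += 1
    return counts


SAMPLE = [
    {"industry": "Automobile", "function": "Sales", "designation": "Manager"},
    {"industry": "", "function": "Sales", "designation": "Manager"},
    {"industry": "", "function": "", "designation": None},
    {"industry": "", "function": "Sales", "designation": "Manager"},
]
```

The key `()` holds fully coded resumes; every other key names one permutation/combination that still needs work.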
When CYRIL designed MODULE
#1 (Data-Capture & Query Software), we vastly enlarged the MASTER-LISTS
(You must see these first).
Even now, these LISTS may
not be exhaustive enough to take care of 95% of the resumes.
This fact (of incompleteness) is
borne out by the fact that, while, recently entering 5000 web-forms
(resumes received over web), the Operators faced some problems, or
rather, problems with respect to some of the web-forms. These are listed & the list is with Sajida.
So,
The first job is to solve
these problems so that these 5000 web-forms
get properly transferred to Module
#1.
Sajida can open each of
these Resumes (on the screen) and tell you what the problem is; you give the correct
answer (by reading the resume on the screen) and Sajida completes the
entry.
When this is done, you can turn
your attention to
ENLARGING THE MASTER-LISTS
Wherever data-entry operators are
facing maximum problems.
I have an intuitive feeling that
the
- Designation Master List &
- Function Master List
are fairly COMPREHENSIVE
and may not need much enlargement/addition. Most likely, it will be the
COMPANY ↔ INDUSTRY NAME MASTER
which may be posing problem.
This is obvious because, even if
we take LTD & PRVT. LTD. Companies of India, the list
could run into 500,000!
(Date: 17-11-01) Now realize that a drop-down list
of "Company-Names" is simply not possible!
And our Candidate-member
(Who has sent his resume by typing & e-mail) could be working NOW & could have worked in past, in any
of these Companies!
So we need to know
- to what "INDUSTRY-CATEGORY" does
each of these 500,000 companies belong?
Once, we have found these answers
- and created
COMPANY NAME ↔ INDUSTRY NAME MASTER LIST
for these 500,000 Companies and entered the same in MODULE #1,
we would have solved 99.99% of
the operator's problems.
The moment he highlights & clicks on a COMPANY-NAME, appearing in
a Candidate's resume, against his
- Current Employment or
- Past Employment,
Module 1 will
automatically pick-up & enter the correct INDUSTRY-NAME from
the MASTER-LIST.
Creating/enlarging MASTER
is a ONE-TIME job and your focus must be on that.
Focus has to be on PREVENTION
(of a data-entry error) rather than CURE!
Of course, one Company may be
active in several industries, simultaneously. In such a case, you
should, in the MASTER, list all such "Industries"
against that company's name. This can be done by simply adding a "comma"
between the names of these industries. Sajida will show you how this is done.
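A minimal sketch of the automatic pick-up, assuming the master stores a company's industries as one comma-separated string as described. The sample entries, the resume keys `current_employer` / `past_employers` and the function names are hypothetical and do not describe Module 1's actual design.

```python
# Hypothetical master rows in the "comma between industry names" style.
MASTER_ROWS = {
    "Company ABC": "Automobile, Engineering",
    "Company XYZ": "Textiles",
}


def industries_of(company, master=MASTER_ROWS):
    """Split the comma-separated master entry into clean industry names."""
    entry = master.get(company, "")
    return [name.strip() for name in entry.split(",") if name.strip()]


def code_employment(resume, master=MASTER_ROWS):
    """Collect industries for current and past employers.

    Returns (industries, unknown); unknown lists employers that are not
    yet in the master, i.e. candidates for enlarging it.
    """
    found, unknown = [], []
    for company in [resume["current_employer"], *resume["past_employers"]]:
        names = industries_of(company, master)
        if not names:
            unknown.append(company)
        for name in names:
            if name not in found:
                found.append(name)
    return found, unknown
```

The `unknown` list is what supports PREVENTION: each employer that lands there is added to the master once, and later resumes naming it are coded without operator effort.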
As far as
Structured Web-form & Floppy
are concerned, the candidate
himself selects
- Industry
- Function
which are most "relevant" to
himself & enters them into the appropriate "FIELDS".
So we have nothing to do/worry! It is
his funeral if he makes wrong
choices!
The problem arises only in "typed"
or "e-mailed" resumes, where there is no structured
field. This is why we are discouraging this method of submission of resumes.
For covering as many COMPANIES
(LTD & PRVT. LTD) into your INDUSTRY MASTER
you may wish to discuss & find a solution (of MERGING into one,
single, master-list of Companies) with help from Sajida, Soma,
Nirav, Cyril etc.
from the following Sources
- 16494 Companies (LTD) - List compiled by
Soma
- 32000 " (LTD & PRVT. LTD) " from KOMPASS 93
- 400,000 " in EXPLORE INDIA CD
- Several printed directories
- CII Membership Floppy (already with us)
- ASSOCHAM " List (to be obtained)
- 60,000 Cos.
- Normal Advt. database
- Corpo. "
- Several other databases on our harddisk
- Several India-related Search Engines/websites on
Internet (I have a list).
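One possible way to MERGE these sources into a single master is to compare companies on a normalised key, so that "Pvt. Ltd." and "PRIVATE LIMITED" spellings of the same name collapse together. Everything below is an assumption for illustration: the suffix list, the "first source wins" rule and the names `company_key` and `merge_sources`.

```python
import re

# Suffix spellings treated as equivalent; this list is an assumption.
_SUFFIXES = re.compile(r"\b(private|pvt|prvt|limited|ltd)\b\.?", re.IGNORECASE)


def company_key(name):
    """Reduce a company name to a comparison key (case, suffixes, spacing)."""
    key = _SUFFIXES.sub(" ", name)
    key = re.sub(r"[^a-z0-9]+", " ", key.lower())
    return key.strip()


def merge_sources(*sources):
    """Merge (source_name, {company: industry}) pairs into one master.

    The first source to mention a company wins; later conflicting
    industries are collected for manual review instead of overwriting.
    """
    master, conflicts = {}, []
    for source_name, rows in sources:
        for company, industry in rows.items():
            key = company_key(company)
            if key not in master:
                master[key] = {"name": company, "industry": industry,
                               "source": source_name}
            elif industry and master[key]["industry"] != industry:
                conflicts.append((company, industry, source_name))
    return master, conflicts
```

Keeping conflicts aside rather than overwriting fits the note's point that one company may genuinely belong to several industries; a reviewer can decide whether to add the second name.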
Although, in itself, such an
objective of Creating a MEGA MASTER-LIST of COMPANIES Vs. INDUSTRIES
would be highly desirable (from the "prevention of error"
angle), the question that we must answer is
- how long will it take you to create such a MEGA
MASTER?
If it takes one month or more,
it may not be worth attempting to create such a list in ONE-GO!
In such a case, it may be much
better to take-up the list of (say) 20000 resumes where INDUSTRY-NAME is
missing, take up one at a time, and go on adding to INDUSTRY MASTER.
Such an approach has the
advantage of
- solving our immediate problem (of making that
resume COMPLETE & therefore SEARCHABLE)
- gradually/simultaneously enlarging the MASTER.
I feel you should take this
approach but, in any case, discuss your strategy with CMT / consultants / Sajida
/ Nirav first.
9/3/99
cc: CMT
cc: Sajida
cc: Nirav / Cyril
YOGESH
31-07-98
Tasneem: If in doubt (as to
what I mean), please consult me. This concerns you.
MODULE #1
DATA CAPTURE & SEARCH QUERY
While designing/implementing
this, please ensure to incorporate
- my handwritten comments on your typed note dt. 28/04/98
- my handwritten note dt. 30/04/98
- my handwritten note dt. 15/07/98
- our several discussions (including when you showed
me some of the data-capture screens on 20/04/98)
On P-12 of my note dt. 30/04/98,
I have mentioned the need to "capture" the "Source".
I hope you have made that provision. This feature especially becomes crucial in
case of RESUME BUILDER FLOPPY, as you can see from enclosed chart. I
suppose, you will provide for "automatic" transfer of relevant
SERIAL NO. (M/P/A/D/O) while "automatic" allotment of PEN
when floppy gets loaded onto the database.
Before you give the demo on SUNDAY,
let someone check-out all points mentioned in all of my notes, to ensure that
these have been incorporated.
Regards
H. E. Parul (H. E. P.)
31-07-98
RESUME BUILDER FLOPPY
SOURCES FROM WHICH RECEIVED (RETURNED)
- "M" Series: 'M' Serial No.
- "P" Series (from P): 'P' Serial No.
- "D" Series (from Distributor): Distributor Code No.
  - For payment of award of Rs. 10/ per CD (Original or clone)
- "A" Series (from Associate): Associate Code No.
  - eg: Mankodi - 999.
  - For payment of associate's share of our Prof. Fees in case of appt of Candidate.
  - Pl. make provision for this on Floppy generating Software on Tasneem's m/c
- "O" (other)
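Capturing the "Source" when a floppy is loaded could look like the sketch below. The series letters come from the chart; the serial layout (letter, hyphen, number, as in "A-999") and the name `parse_source` are assumptions made only for illustration.

```python
# Series letters come from the chart; the "A-999" serial layout
# (letter, hyphen, number) is an assumption made for this sketch.
SERIES = {
    "M": "M series",
    "P": "P series",
    "D": "Distributor",
    "A": "Associate",
    "O": "Other",
}


def parse_source(serial):
    """Split a floppy serial into (series letter, numeric code or None)."""
    series, _, code = serial.strip().upper().partition("-")
    if series not in SERIES:
        return ("O", None)            # unrecognised series goes to "Other"
    return (series, int(code) if code.isdigit() else None)
```

Storing the parsed pair alongside the PEN would let the distributor award or the associate's share be traced back later.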
CYRIL / Sajida / Chetan / Nishit
/ H. C. P. (1)
YOGESH
15-07-98
MODULE 1
DATA CAPTURE & SEARCH
I refer to our discussion in my
office yesterday when you explained how the data-capture process (under Module
1) will work with
- Internet } both structured EDS as well as
- Extranet } e-mailed resumes
- Resume Floppy
- Hard copy (Typed resumes).
As soon as we install Voice-Recognition
Interface/Software on our E-PABX, we would also be able to capture VOICE-RESUMES
which will be, essentially electronic files. I suppose these will be
treated as e-mails.
During our discussions, it was
also felt that there is a VITAL need to build-up a database of NON-MEMBERS.
We agreed that these persons
should not be allotted PEN - also that this database should be maintained
Separately and must not be mixed-up with MEMBER
database.
The main reason for deciding this
was that a person whose data we
capture today (partial data
of course), could very well become our MEMBER tomorrow by entering his
resume on Internet/Extranet.
So, if we allot him a PEN
today (as a NON-MEMBER), we may end-up allotting him another PEN
tomorrow (as a MEMBER)! In fact Internet-Extranet will do this even without
our knowledge!
To avoid this we decided that we
shall create a distinct database for NON-MEMBERS (without PEN).
But this database will still be SEARCHABLE.
This is essential because our Consultants could use this database for
contacting "prospective-potential" Candidates.
As suggested by you, I have
prepared a "tabulation" as enclosed. Although there could be
dozens of "SOURCES" for NON-MEMBER data, I have picked
some 14 different sources which were readily available in our office.
Against each source, I have
ticked $\checkmark$ the data which I found in that source. This does not mean
that each & every "field" is available
for each & every person in that particular source,
But
If a particular field (data)
occurs even in a few cases, we must make provision for that, otherwise we will
miss-out data on that person.
You may now, use the enclosed
sheet to create the necessary data-capture screen for NON-MEMBER DATABASE.
My own observations:
① Although
- current job (working since)
- current job salary
- Total years of experience
is to be found in only one
SOURCE, Viz. Employee details (Annual Reports),
this is a VERY VERY IMPORTANT
data from the viewpoint of
- Com.com (websites)
- headhunting by consultants
So, we must keep it.
② Again, Only one
SOURCE, Viz.: MEMBER DATA UPDATE FORM,
Contains "fields"
for
- Function
- Product/Service exposure
- Industry
- Specialization
- & Achievements.
I have designed this form and
sent to 93 local chapters (Centres) of The Institution of Engineers,
with a request to distribute amongst their 65000+ members, who will (hopefully)
fill-in and directly return to 3P.
I have promised to enter this
data & create a database and make it available to
individual Centre. This service is FREE.
If this experiment succeeds, I
propose to repeat it with other Professional Bodies such as
- NIPM (National Institute of Personnel
Managers) etc.
In which case, I would like to
standardize on this form. I am preparing it in block-format fields so that we could capture the fields directly.
So we should keep these fields as
well.
③ There are a number of SOURCES,
Where SOME or OTHER data
is missing/not available.
But, because we are trying to
create a generalised/universal data-capture screen, I suppose, we have no
choice except to make provision for all the fields listed on the enclosed
tabulation.
On the other hand, we must keep
the database so flexible that we can add more "fields" in
future as new "sources" might dictate.
While on the subject, please
seriously consider a drag & drop type data-capture facility for JOB-ADVERTISEMENTS,
which we are, in any case,
scanning/OCR'ing & converting to text format. This is because a
typical job-advertisement is the other side of the coin (of resume), with almost
identical
SEARCH-PARAMETERS.
Also because,
only a few months down the line,
when we start work on Module #3 PRO-ACTIVE MARKETING, we have already
provided for "Advt. based Query".
But, we must also plan for
PRO-ACTIVELY PROMOTING A
CANDIDATE (based on his resume).
Idea is that every new resume
received everyday in our database will, automatically be matched with the
JOBS AVAILABLE DATABASE
or even
PAST JOB-ADVT. HISTORY OF EACH & EVERY CORPORATE
for the last 3 years, to see if
that Corporate had ever advertized in the past for such a Candidate,
and especially, whether any
Corporation had, in the last 3 years
REPEATEDLY ADVERTISED FOR SUCH
CANDIDATE.
If so, it is a proof that they
either need such a person NOW or will need him in FUTURE.
And We want to tell such an
advertiser that we have their MAN-FRIDAY in our database, whenever they
need!
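The daily matching idea can be sketched as a count of past advertisements per corporate for the new resume's profile. The sketch assumes resumes and adverts share the three coded fields (Industry, Function, Designation) and that an exact match on all three is wanted; the dictionary keys, the 365-day year and the name `repeat_advertisers` are illustrative assumptions.

```python
from collections import Counter
from datetime import date, timedelta


def repeat_advertisers(resume, adverts, today, years=3, min_repeats=2):
    """Return (company, count) pairs for corporates that advertised for
    this profile at least min_repeats times in the look-back window,
    most frequent first."""
    cutoff = today - timedelta(days=365 * years)
    profile = (resume["industry"], resume["function"], resume["designation"])
    hits = Counter(
        ad["company"]
        for ad in adverts
        if ad["date"] >= cutoff
        and (ad["industry"], ad["function"], ad["designation"]) == profile
    )
    return [(c, n) for c, n in hits.most_common() if n >= min_repeats]
```

Only the company name and the count are returned, so the message to the advertiser can be composed without exposing the candidate or his present employer.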
All of this correspondence
should get done
- automatically (without human intervention)
- daily
- over email/fax
- without revealing the identity of the Candidate
or the identity of his present employer.
In last 7 months, we have already
Scanned/OCR'ed & converted to text file, over 12000 Job-Advts
and this database is growing at
the rate of 1000/month.
For each of these, following
fields (Search parameters) are already entered
- Industry
- Function
- Designation
For last 4/5 months, we have also
started Keying-in
4. Name of Contact person (or Designation of
Person to whom to apply)
5. Company (Advertiser) Name
6. Company Address
7. Phone / Fax / e-mail
Entry of these Fields is
currently being done by "keying-in". This is a very
time-consuming & unproductive method.
Considering that
- this database Creation is our "CORE"
activity/business process
- this activity will grow a dozen-times or even a
hundred-times in the months and years to come (as we cover all
magazines/newspapers in India & abroad - which carry job advertisements),
we must immediately "re-engineer"
this process (BPR!).
We have therefore, no option but
to scan/OCR/text and data-capture thru DRAG & DROP WIZARD, all the
JOB-ADVERTISEMENTS
as well.
I, therefore, earnestly request
you to incorporate this in MODULE 1 right now.
In fact, ADVERTISEMENT DATA
CAPTURE SCREEN should be so flexible/generalized, that tomorrow we can use
it for any other (type) of advts to capture essential data about
- "Who" is the advertiser
- "What" (product or service) is
being advertised
- "To whom" it is "targeted"
(Readership)
Advts which I can off-hand think
about are
- Univ/college admissions
- Coaching classes
- Training Institutions (esp. Computer Training)
- Tenders for Supply of goods
- " for Construction contracts
- " for Consultancy
- " for Maintenance
- Advt. for Sale of Consumer Goods/durables
- " Engineering Goods
- " Software.
- " Services (eg. Telephone)
- " " (eg. Hospital)
- " " (eg. Advertising)
When we met on 28/5/98 (to
discuss your typed proposal on MODULE #1 / WIZARDS / OVERALL STRUCTURES),
we agreed upon following SCHEDULE:
| Serial No | MODULE DESCRIPTION | Earliest Start Date | Time to Complete | Likely Overflow |
| 1 | Data Capture & Search | 28/5/98 | 6 weeks (2 for design, 2 for coding, 2 for testing) | 2 weeks |
| 2 | Order Execution & All Features of CONTEXT CARTRIDGE | 30/7/98 | 8 weeks | 2 weeks |
I would request you to quickly draw-up
a most realistic TIME-TABLE for the remaining modules and Send me
your proposal. In one of the modules (NO. 2 ?) you must incorporate CORPORATE
DATABASE creation as well.
With regards
H. E. P.
16/7/98
COMMENTS ON MODULE 1
30/4/98
"WIZARDS"
diagram (Data Processing)
- Manual "DB Entry" should be
replaced by "Entry thru ARDIS".
If ARDIS cannot be ready (as
I suspect) within the next 4 weeks, then, at the least, we should definitely
get the Context Cartridge to generate/extract 16 "themes" from
each OCR'd resume and put these in respective database fields.
We MUST eliminate manual (Keyboard)
entry of DB fields.
- A common "PEN Allotment Software"
must
- ensure
avoidance of duplicate Numbers being allotted (we should simply assume
that each candidate has earlier sent to us his typed resume and that each
of them may have been allotted a PEN already)
- A kind of
"default" condition.
- ensure that
the Correct/appropriate "Serial" is used, depending upon
the "Source" of EDS,
Viz.
- Internet - 3 million Series
- Extranet - 2 " "
- EDS on floppy - 1 " "
- Hard Copy - 0 " "
- Non Members - 4 " "
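A common PEN allotment routine along these lines might look like the sketch below. The series bases follow the list above; identifying a person by (name, birth date), starting each series at base + 1, and the class name `PenAllotter` are assumptions, not a description of the actual software.

```python
# Series bases follow the note; everything else here is an assumption.
SERIES_BASE = {
    "hard_copy": 0,
    "floppy": 1_000_000,
    "extranet": 2_000_000,
    "internet": 3_000_000,
    "non_member": 4_000_000,
}
SERIES_SIZE = 1_000_000


class PenAllotter:
    """Allot one PEN per person, drawn from the series of the first source."""

    def __init__(self):
        self._next = {src: base + 1 for src, base in SERIES_BASE.items()}
        self._by_person = {}

    def allot(self, name, birth_date, source):
        # "Default" condition: assume the person may already hold a PEN.
        key = (name.strip().lower(), birth_date)
        if key in self._by_person:
            return self._by_person[key]
        pen = self._next[source]
        if pen >= SERIES_BASE[source] + SERIES_SIZE:
            raise OverflowError(f"{source} series is exhausted")
        self._next[source] = pen + 1
        self._by_person[key] = pen
        return pen
```

Checking for an existing PEN before drawing a new number is what prevents duplicates when the same candidate arrives through a second source.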
Fields (Themes, as found in Hard Copy of EDS)
(in the order of Importance)
- Industry
- Function
- Education
- Product/Service Exposure
- Languages
- City/Country Preference
- Knowledge/Skills (P-2 "My Search Profile" of EDS)
- Birth Date/Age
- Designation
- Experience
- Current Company-Name
- Past Employer Name
- Gross Annual Salary
- Keywords
- Name of Executive
- PEN
- Resume Date.
- Image Files
It would be highly desirable to
be able to Search Image Files by
- Name of Executives
- PEN
Within 2/3 years, harddisk (or
any other RANDOM ACCESS MEMORY) will be so huge, that, instead of storing image
files on 50/75 CDs (of 650 MB each), we
would be storing all 50,000 image-files on a single storage device,
which could be randomly accessed.
When this happens, it would be
highly desirable to be able to quickly "locate & View" the image of any resume
in a matter of seconds.
This would be extremely useful
when a Candidate calls over phone and says, "I am Mr ......."
or
"My PEN is ......."
At that moment each & every consultant in our office should be
able to instantly view the image (of resume) & add his remarks/annotations as he is
talking to the Candidate
over phone &
- giving him
some "feedback"
- getting from
him some "feedback" (eg. interest)
- giving him
some "instructions" (eg. interview)
Every telephonic conversation
with each & every Candidate gets "recorded in the image
file". This would enable all consultants to know precisely what
has transpired (a kind of JANAM-KUNDLI), so that there is no
duplicate/wasted effort.
Perhaps, instead of on image-file,
all of these "recording/annotations" could be made on txt
files (which, I suppose, will in any case be on a Single hard-disk even now).
Time will soon come, when these
"Recordings/Annotations/Remarks" will be carried-out thru SPEECH
RECOGNITION SOFTWARE, as a Consultant is talking to a Candidate
on phone.
SPELL-CHECK
This is the slowest of the manual
processes (database entry by keying-in, is another).
This process must be automated.
OCR softwares cannot
"automate" spell-check, because OCR software has no
"knowledge/clue" of the "context" in which a particular
word is used/employed in a sentence.
But ARDIS will have this
"knowledge".
In fact, I have already categorised
over 15000 most commonly used words (in a resume).
Once our 50,000 resumes have been
scanned/OCR'ed, then we would perhaps have, with us, a database of over 5
Million english-language sentences.
It would be a Simple exercise
to find and list
- all the sentences in which each of the above-mentioned
15000 words has appeared.
So, let us say, each of these
words has appeared in 300 sentences.
And, we anyway know the correct
"spelling" of each of these 15000 words.
So next time, ARDIS comes
across (in any scanned/OCR'd resume), "any sentence" which
closely "resembles" one of the 300 sentences, we know, what
precise word, the writer has intended to use.
We establish the "context"
statistically.
Having done that, we tell the
software to "Correct" the spelling of the words
so we eliminate manual "spell-check".
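A very rough sketch of this "context by resemblance" idea, using Python's standard `difflib` for the similarity scoring. This is not ARDIS: the function name `correct_sentence`, the 0.6 similarity cutoff, and the whitespace-only tokenising are all assumptions chosen to keep the example small.

```python
import difflib


def correct_sentence(ocr_sentence, known_sentences, vocabulary):
    """Fix out-of-vocabulary words using the closest known sentence.

    vocabulary is a set of correctly spelt lower-case words;
    known_sentences are clean sentences already in the database.
    """
    if not known_sentences:
        return ocr_sentence
    # 1. Find the stored sentence that most resembles the OCR'd one.
    best = max(
        known_sentences,
        key=lambda s: difflib.SequenceMatcher(
            None, ocr_sentence.lower(), s.lower()).ratio(),
    )
    context_words = [w.lower() for w in best.split()]
    # 2. Replace each unknown token with its nearest word from that
    #    sentence (the replacement comes back in lower case).
    fixed = []
    for token in ocr_sentence.split():
        if token.lower() in vocabulary:
            fixed.append(token)
            continue
        match = difflib.get_close_matches(
            token.lower(), context_words, n=1, cutoff=0.6)
        fixed.append(match[0] if match else token)
    return " ".join(fixed)
```

Restricting the candidate corrections to words of the most similar known sentence is what stands in for "context" here; a word with no close match in that sentence is left untouched for a human to check.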
Quality Control.
Can we have a "tag"
which will tell us the name of the "operator" who scanned/OCR'd/spell-checked
a particular resume?
This way, if there are any
mistakes of
- spell check
- Database entry,
we could instantly know whom to catch!
The very knowledge that each
resume is being so "tagged" will ensure that the operators are
extra careful not to make any mistake!
NON-MEMBERS
These are executives who have
NOT registered with us i.e. they have not given us their Resumes.
a. Perhaps they are so
old or so senior (e.g. Company Chairman/M.D./Owner etc) that they are never
likely to be looking for a job.
OR
b. They are not looking
for a job "currently" but, perhaps, would not mind looking at an
"opportunity" if presented.
There are hundreds of
thousands of executives falling under category (b).
We have bare-minimum
information about them. But we still wish to enter this min. information
- and their name - in our database, so that over a period of time, we
could keep track of their "movements" (from one company to
another) and be able to "headhunt" them if & when need arises.
Therefore we decided to give them
4,000,000 PEN series.
Mr. XXX.
PEN: 4,000,050.
| Designation | Company | Source of Info. | Date | Remarks |
| 1 | | | | |
| 2 | | | | |
| 3 | | | | |
| 4 | | | | |
Available Data
1.
2.
3.
What happens when such an
executive (non-member) registers with us at a future date, and becomes a
"member"?
I think we should continue with
the same PEN (maybe) with an asterisk (*) to indicate that he has now
become a "member" and therefore, we have his full resume
somewhere.
QUERY
- Industry
- Function
- Edu. Quali.
List for each of these could
contain 200 names.
It would be too tedious & tiring for any Consultant/Headhunter to scroll
thru a long list to locate that "name" which is "most
appropriate" to the SEARCH-ON-HAND.
A compromise formula could be:
Consultant/Headhunter manually
types/enters a word, eg. Auto
Industry
Explore
Immediately he sees a drop-down
list as follows:
| Industry | |
| Auto | O |
| Automobile | |
| Car | |
| Truck | |
| Vehicle | |
| 2 Wheeler | |
| " | |
| " | |
| " | |
| Tractor | |
| Scooter | |
| Motor Cycle | |
An executive could have mentioned
in his online/offline resume that he belongs to any one of the
above-mentioned "industries".
So a Consultant/Headhunter can
"narrow-down" or "enlarge" his search, by
clicking on ONE OR MORE of the industry-names appearing in the drop-down
list, successively, till he finds his MAN-FRIDAY.
This way, he does not have to scroll
thru 200 industry-names before locating the "most appropriate"
name.
He starts by typing a broad/generic
name and then quickly zeroes in on the most appropriate name from the DROP
DOWN LIST.
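The "type a broad word, then pick from a short list" compromise can be sketched as follows. The groups of related names, the prefix-matching rule and the function names `explore` and `search` are illustrative assumptions; the first group mirrors the "Auto" example in the note.

```python
# Hypothetical groups of related names; the first group mirrors the
# "Auto" example in the note.
INDUSTRY_GROUPS = [
    ["Auto", "Automobile", "Car", "Truck", "Vehicle", "2 Wheeler",
     "Tractor", "Scooter", "Motor Cycle"],
    ["Cement", "Building Materials"],
]


def explore(typed):
    """Return the short drop-down list for a typed word.

    The first group containing a name that starts with the typed text
    is shown in full; an empty list means nothing matched.
    """
    needle = typed.strip().lower()
    if not needle:
        return []
    for group in INDUSTRY_GROUPS:
        if any(name.lower().startswith(needle) for name in group):
            return list(group)
    return []


def search(resumes, selected):
    """Return resumes whose industry is one of the ticked names."""
    wanted = {name.lower() for name in selected}
    return [r for r in resumes if r["industry"].lower() in wanted]
```

Ticking one name narrows the search and ticking several enlarges it, which is the successive narrowing/enlarging the note describes.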
This process could be "reversed"
for an
- Executive trying to enter his resume Online
OR
- data-entry operator trying to decide THE MOST
APPROPRIATE search parameter (Industry/Function/Edu. Quali.) for a
candidate.
DATA ENTRY / QUERY
We must enter the "Source"
as far as resumes received from "Associates" are concerned.
Please ensure a provision for this.
If an associate's candidate gets
placed, the associate is entitled to his share of our professional fees. So we
must know the "source" in such cases.
QUERY
As far as "Students/Fresh
Graduates" are concerned, only EDS on floppy or Extranet EDS will be accepted from such
persons. Pl. see my exhaustive notes on the
SEARCH PARAMETERS FOR FRESH
GRADUATES.
These will have to be
incorporated in your query package.