The system tracks six types of assets and resources for academic institutions: Faculty, Research Center, Technology, Facility, Equipment, and Lab. A separate database table is maintained for each type. Assets and resources are referred to as "profile types", and each type is assigned a unique id, profileTypeId, as follows: FacultyProfile - 1, ResearchCenterProfile - 2, TechnologyProfile - 3, FacilityProfile - 4, EquipmentProfile - 5, LabProfile - 6. Each institution also has a unique institution id; for example, UT Arlington has institution id 1 and UT Dallas has institution id 9.
Each individual asset or resource (that is, each profile) is associated with a web page in the respective profile system, and that page is accessed through its own unique "profile id". The system generates a unique id for every profile by combining three ids: institution id, profile type id, and profile id. This ensures that any entity can be identified unambiguously across the whole system. The combined id is stored in the "id" field of the asset/resource (profile) table.
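As an illustration of how the combined id might be built and taken apart again, here is a minimal Python sketch. The dot-separated layout follows the Structure column of the tables below (institutionId.profileTypeId.profileId); the function and class names are hypothetical and not part of the system.

```python
from typing import NamedTuple

# Profile type ids as listed in the text above.
PROFILE_TYPE_IDS = {
    "FacultyProfile": 1,
    "ResearchCenterProfile": 2,
    "TechnologyProfile": 3,
    "FacilityProfile": 4,
    "EquipmentProfile": 5,
    "LabProfile": 6,
}


class ProfileKey(NamedTuple):
    """The three components of a combined id (hypothetical helper type)."""
    institution_id: int
    profile_type_id: int
    profile_id: int


def build_id(institution_id: int, profile_type_id: int, profile_id: int) -> str:
    """Join the three ids with dots, rejecting unknown profile type ids."""
    if profile_type_id not in PROFILE_TYPE_IDS.values():
        raise ValueError(f"unknown profileTypeId: {profile_type_id}")
    return f"{institution_id}.{profile_type_id}.{profile_id}"


def parse_id(value: str) -> ProfileKey:
    """Split a three-part id back into its integer components."""
    parts = value.split(".")
    if len(parts) != 3:
        raise ValueError(f"expected three dot-separated parts: {value!r}")
    return ProfileKey(*(int(part) for part in parts))
```

For example, build_id(1, 1, 178) yields the Faculty example id "1.1.178" used in the first table.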
A record (entity) in a profile table is generated through the following steps:
1. Fetch the asset-specific profile web page from the respective profile system.
2. Parse the web page.
3. Transform the parsed data into information that follows a specific structure.
4. Create the record for the designated profile table.
5. Insert the record.
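The steps above can be sketched end to end with the Python standard library. This is only an outline under stated assumptions: the table name faculty_profile, the reduced column set, and the use of the page title as the name are all hypothetical stand-ins (the real parsing rules are the ones given in the tables), and SQLite is used purely for illustration.

```python
import sqlite3
from html.parser import HTMLParser
from urllib.request import urlopen


class TitleParser(HTMLParser):
    """Collects the text of the <title> element (stand-in for the real rules)."""

    def __init__(self) -> None:
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def fetch(url: str) -> str:
    """Step 1: fetch the profile web page."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def parse(html: str) -> dict:
    """Step 2: parse the page (here only the title is extracted)."""
    parser = TitleParser()
    parser.feed(html)
    return {"title": parser.title.strip()}


def transform(parsed: dict, html: str, institution_id: int,
              profile_type_id: int, profile_id: int) -> dict:
    """Steps 3 and 4: shape the parsed data into a record."""
    return {
        "id": f"{institution_id}.{profile_type_id}.{profile_id}",
        "profileId": profile_id,
        "profileTypeId": profile_type_id,
        "institutionId": institution_id,
        "name": parsed["title"],  # assumption: real name comes from the contact section
        "data": html,             # complete html of the page
    }


def create_table(conn: sqlite3.Connection) -> None:
    """Create a reduced, hypothetical version of the Faculty table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS faculty_profile ("
        "id VARCHAR(45) PRIMARY KEY, profileId INTEGER NOT NULL, "
        "profileTypeId INTEGER NOT NULL, institutionId INTEGER NOT NULL, "
        "name TEXT NOT NULL, data TEXT NOT NULL)"
    )


def insert(conn: sqlite3.Connection, record: dict) -> None:
    """Step 5: insert the record."""
    conn.execute(
        "INSERT INTO faculty_profile "
        "(id, profileId, profileTypeId, institutionId, name, data) "
        "VALUES (:id, :profileId, :profileTypeId, :institutionId, :name, :data)",
        record,
    )
```

Keeping each step as its own function lets the parsing and transformation rules be tested on saved pages without touching the network or the database.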
The parsed data is transformed according to the rules described in the following tables.
Storage Structure of Asset "Faculty"
| # | Data Element | Type | Size | Description | Required | Multivalued | Structure | Example | Note |
|---|---|---|---|---|---|---|---|---|---|
| 1 | id | varchar | 45 | unique id that can differentiate an entity throughout the system | Yes | No | institutionId.profileTypeId.profileId | "1.1.178" | |
| 2 | profileId | int | 10 | unique id of the profile web page | Yes | No | - | - | profile id from the url |
| 3 | profileTypeId | int | 10 | unique id of the type of profile | Yes | No | - | - | There are six profile types, each with its own unique id: Faculty profile - 1, Research center profile - 2, Technology profile - 3, Facility profile - 4, Equipment profile - 5, Lab profile - 6 |
| 4 | institutionId | int | 10 | id of the respective academic institution | Yes | No | - | - | There are 9 academic institutions in the partnership: University of Texas at Arlington: 1, University of Texas - Pan American: 2, University of North Texas Health Science Center: 3, University of Texas at El Paso: 4, University of Texas at San Antonio: 5, University of Texas at Tyler: 6, UT Health Science Center - San Antonio: 7, University of North Texas: 8, University of Texas at Dallas: 9 |
| 5 | name | text | - | name of the faculty | Yes | No | name of the faculty retrieved from the contact information section of the web page | Dr. Mohan J Kumar | |
| 6 | data | text | - | complete html of the web page is crawled and stored in this field | Yes | No | - | - | |
| 7 | contactName | varchar | 255 | name of the primary contact person of this asset | No | No | full name of the person retrieved from the web page | Dr. Mohan J Kumar | |
| 8 | contactPhone | varchar | 45 | contact phone of the primary contact person of this asset | No | No | phone number retrieved from the web page | 817-272-3610 | |
| 9 | contactEmail | varchar | 255 | email of the primary contact person of this asset | No | No | email retrieved from the web page | mkumar@uta.edu | |
| 10 | address | varchar | 255 | physical address location of this asset | Yes | No | physical address retrieved from the web page | 500 UTA Blvd | If the address is not found in the web page, assign the default address of the parent institution |
| 11 | city | varchar | 45 | city where the asset is located | Yes | No | city name retrieved from the web page | Arlington | If the city name is not found in the web page, assign the default city name of the parent institution |
| 12 | state | varchar | 45 | state where the asset is located | Yes | No | state name retrieved from the web page | TX | If the state name is not found in the web page, assign the default state name of the parent institution |
| 13 | zip | varchar | 45 | zip code where the asset is located | Yes | No | zip code retrieved from the web page | 76019 | If the zip code is not found in the web page, assign the default zip code of the parent institution |
| 14 | country | varchar | 45 | country where the asset is located | Yes | No | currently only USA | USA | |
| 15 | lastUpdated | datetime | | date and time of the last update done on this profile | Yes | No | yyyy-mm-dd hh:mi:ss (24 hr) | 2012-08-20 12:41:13 | This information is parsed from the top right corner of the web page |
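The fallback rule in the Note column for address, city, state, and zip can be sketched as follows. The defaults lookup is hypothetical: the document does not list the default location of any institution, so the UT Arlington values below are simply borrowed from the table's example column as placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstitutionDefaults:
    """Default location values of a parent institution (hypothetical type)."""
    address: str
    city: str
    state: str
    zip: str


# Placeholder defaults keyed by institutionId. The values are illustrative
# only; they reuse the example column and are not confirmed defaults.
DEFAULTS = {
    1: InstitutionDefaults("500 UTA Blvd", "Arlington", "TX", "76019"),
}

LOCATION_FIELDS = ("address", "city", "state", "zip")


def apply_location_defaults(parsed: dict, institution_id: int) -> dict:
    """Fill missing or blank location fields from the parent institution."""
    defaults = DEFAULTS[institution_id]
    result = dict(parsed)
    for field in LOCATION_FIELDS:
        value = (result.get(field) or "").strip()
        result[field] = value or getattr(defaults, field)
    # The tables state that the country is currently always USA.
    result["country"] = "USA"
    return result
```

Because these four fields are marked as required, applying the defaults before record creation means a page without location details still produces a complete record.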
Storage Structure of Asset "Research Center"
| # | Data Element | Type | Size | Description | Required | Multivalued | Structure | Example | Note |
|---|---|---|---|---|---|---|---|---|---|
| 1 | id | varchar | 45 | unique id that can differentiate an entity throughout the system | Yes | No | institutionId.profileTypeId.profileId | "1.2.507" | |
| 2 | profileId | int | 10 | unique id of the profile web page | Yes | No | - | - | profile id from the url |
| 3 | profileTypeId | int | 10 | unique id of the type of profile | Yes | No | - | 2 | There are six profile types, each with its own unique id: Faculty profile - 1, Research center profile - 2, Technology profile - 3, Facility profile - 4, Equipment profile - 5, Lab profile - 6 |
| 4 | institutionId | int | 10 | id of the respective academic institution | Yes | No | - | - | There are 9 academic institutions in the partnership: University of Texas at Arlington: 1, University of Texas - Pan American: 2, University of North Texas Health Science Center: 3, University of Texas at El Paso: 4, University of Texas at San Antonio: 5, University of Texas at Tyler: 6, UT Health Science Center - San Antonio: 7, University of North Texas: 8, University of Texas at Dallas: 9 |
| 5 | name | text | - | name of the asset | Yes | No | name of the research center retrieved from the web page | Electronics MEMS & Nanoelectronics Systems Packaging Center | |
| 6 | data | text | - | complete html of the web page is crawled and stored in this field | Yes | No | - | - | |
| 7 | contactName | varchar | 255 | name of the primary contact person of this asset | No | No | full name of the person retrieved from the web page | Dr. Lewis, Frank L. | |
| 8 | contactPhone | varchar | 45 | contact phone of the primary contact person of this asset | No | No | phone number retrieved from the web page | 817-272-5972 | |
| 9 | contactEmail | varchar | 255 | email of the primary contact person of this asset | No | No | email retrieved from the web page | lewis@uta.edu | |
| 10 | address | varchar | 255 | physical address location of this asset | Yes | No | physical address retrieved from the web page | 701 South Nedderman Drive | If the address is not found in the web page, assign the default address of the parent institution |
| 11 | city | varchar | 45 | city where the asset is located | Yes | No | city name retrieved from the web page | Arlington | If the city name is not found in the web page, assign the default city name of the parent institution |
| 12 | state | varchar | 45 | state where the asset is located | Yes | No | state name retrieved from the web page | TX | If the state name is not found in the web page, assign the default state name of the parent institution |
| 13 | zip | varchar | 45 | zip code where the asset is located | Yes | No | zip code retrieved from the web page | 76019 | If the zip code is not found in the web page, assign the default zip code of the parent institution |
| 14 | country | varchar | 45 | country where the asset is located | Yes | No | currently only USA | USA | |
| 15 | lastUpdated | datetime | | date and time of the last update done on this profile | Yes | No | yyyy-mm-dd hh:mi:ss (24 hr) | 2009-05-27 08:31:43 | This information is parsed from the top right corner of the web page |
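The lastUpdated structure, yyyy-mm-dd hh:mi:ss on a 24-hour clock, corresponds to the strptime pattern shown in the sketch below. The helper names are hypothetical; only the format itself comes from the table.

```python
from datetime import datetime

# yyyy-mm-dd hh:mi:ss with a 24-hour clock, as given in the Structure column.
LAST_UPDATED_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_last_updated(text: str) -> datetime:
    """Parse the timestamp scraped from the page; raises ValueError if malformed."""
    return datetime.strptime(text.strip(), LAST_UPDATED_FORMAT)


def format_last_updated(value: datetime) -> str:
    """Render a datetime back into the storage format."""
    return value.strftime(LAST_UPDATED_FORMAT)
```

Parsing into a datetime first, rather than storing the scraped string directly, surfaces pages whose timestamp does not match the expected layout.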
Storage Structure of Asset "Technology"
| # | Data Element | Type | Size | Description | Required | Multivalued | Structure | Example | Note |
|---|---|---|---|---|---|---|---|---|---|
| 1 | id | varchar | 45 | unique id that can differentiate an entity throughout the system | Yes | No | institutionId.profileTypeId.profileId | "1.3.469" | |
| 2 | profileId | int | 10 | unique id of the profile web page | Yes | No | - | - | profile id from the url |
| 3 | profileTypeId | int | 10 | unique id of the type of profile | Yes | No | - | 3 | There are six profile types, each with its own unique id: Faculty profile - 1, Research center profile - 2, Technology profile - 3, Facility profile - 4, Equipment profile - 5, Lab profile - 6 |
| 4 | institutionId | int | 10 | id of the respective academic institution | Yes | No | - | - | There are 9 academic institutions in the partnership: University of Texas at Arlington: 1, University of Texas - Pan American: 2, University of North Texas Health Science Center: 3, University of Texas at El Paso: 4, University of Texas at San Antonio: 5, University of Texas at Tyler: 6, UT Health Science Center - San Antonio: 7, University of North Texas: 8, University of Texas at Dallas: 9 |
| 5 | name | text | - | name of the asset | Yes | No | name of the technology retrieved from the web page | Engine Oil Additive | |
| 6 | data | text | - | complete html of the web page is crawled and stored in this field | Yes | No | - | - | |
| 7 | contactName | varchar | 255 | name of the primary contact person of this asset | No | No | full name of the person retrieved from the web page | Natalia Toth | |
| 8 | contactPhone | varchar | 45 | phone number of the primary contact for this asset | No | No | phone number retrieved from the web page | (915) 747-7007 | |
| 9 | contactEmail | varchar | 255 | email address of the primary contact for this asset | No | No | email retrieved from the web page | savena@utep.edu | |
| 10 | address | varchar | 255 | physical address location of this asset | Yes | No | physical address retrieved from the web page | 500 West University Avenue | If the address is not found in the web page, assign the default address of the parent institution |
| 11 | city | varchar | 45 | city where the asset is located | Yes | No | city name retrieved from the web page | El Paso | If the city name is not found in the web page, assign the default city name of the parent institution |
| 12 | state | varchar | 45 | state where the asset is located | Yes | No | state name retrieved from the web page | TX | If the state name is not found in the web page, assign the default state name of the parent institution |
| 13 | zip | varchar | 45 | zip code of the asset | Yes | No | zip code retrieved from the web page | 79968 | If the zip code is not found in the web page, assign the default zip code of the parent institution |
| 14 | country | varchar | 45 | country where the asset is located | Yes | No | currently only USA | USA | |
| 15 | lastUpdated | datetime | | date and time of the last update done on this profile | Yes | No | yyyy-mm-dd hh:mi:ss (24 hr) | 2007-05-04 11:10:00 | This information is parsed from the top right corner of the web page |
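The Required and Size columns lend themselves to a mechanical check before a record is inserted. The sketch below encodes those two columns for the fifteen data elements; the function name and the shape of the error messages are hypothetical.

```python
# (required, maximum length) for each varchar data element, from the tables.
VARCHAR_RULES = {
    "id": (True, 45),
    "contactName": (False, 255),
    "contactPhone": (False, 45),
    "contactEmail": (False, 255),
    "address": (True, 255),
    "city": (True, 45),
    "state": (True, 45),
    "zip": (True, 45),
    "country": (True, 45),
}

# Required data elements that have no varchar size limit.
REQUIRED_OTHER = (
    "profileId", "profileTypeId", "institutionId", "name", "data", "lastUpdated",
)


def validate_record(record: dict) -> list[str]:
    """Return a list of problems; an empty list means the record passes."""
    errors = []
    for field in REQUIRED_OTHER:
        if record.get(field) in (None, ""):
            errors.append(f"{field}: required")
    for field, (required, max_len) in VARCHAR_RULES.items():
        value = record.get(field)
        if value in (None, ""):
            if required:
                errors.append(f"{field}: required")
            continue
        if len(str(value)) > max_len:
            errors.append(f"{field}: longer than {max_len} characters")
    return errors
```

Returning all problems at once, instead of raising on the first, makes it easier to log every defect of a badly parsed page in one pass.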
Storage Structure of Asset "Facility"
| # | Data Element | Type | Size | Description | Required | Multivalued | Structure | Example | Note |
|---|---|---|---|---|---|---|---|---|---|
| 1 | id | varchar | 45 | unique id that can differentiate an entity throughout the system | Yes | No | institutionId.profileTypeId.profileId | "1.4.591" | |
| 2 | profileId | int | 10 | unique id of the profile web page | Yes | No | - | - | profile id from the url |
| 3 | profileTypeId | int | 10 | unique id of the type of profile | Yes | No | - | 4 | There are six profile types, each with its own unique id: Faculty profile - 1, Research center profile - 2, Technology profile - 3, Facility profile - 4, Equipment profile - 5, Lab profile - 6 |
| 4 | institutionId | int | 10 | id of the respective academic institution | Yes | No | - | - | There are 9 academic institutions in the partnership: University of Texas at Arlington: 1, University of Texas - Pan American: 2, University of North Texas Health Science Center: 3, University of Texas at El Paso: 4, University of Texas at San Antonio: 5, University of Texas at Tyler: 6, UT Health Science Center - San Antonio: 7, University of North Texas: 8, University of Texas at Dallas: 9 |
| 5 | name | text | - | name of the asset | Yes | No | name of the facility retrieved from the web page | Characterization Center for Materials and Biology | |
| 6 | data | text | - | complete html of the web page is crawled and stored in this field | Yes | No | - | - | |
| 7 | contactName | varchar | 255 | name of the primary contact person of this asset | No | No | full name of the person retrieved from the web page | Dr. Stephanou, Harry E | |
| 8 | contactPhone | varchar | 45 | contact phone of the primary contact person of this asset | No | No | phone number retrieved from the web page | 8172725 | |
| 9 | contactEmail | varchar | 255 | email of the primary contact person of this asset | No | No | email retrieved from the web page | hes@arri.uta.edu | |
| 10 | address | varchar | 255 | physical address location of this asset | Yes | No | physical address retrieved from the web page | 701 South Nedderman Drive | If the address is not found in the web page, assign the default address of the parent institution |
| 11 | city | varchar | 45 | city where the asset is located | Yes | No | city name retrieved from the web page | Arlington | If the city name is not found in the web page, assign the default city name of the parent institution |
| 12 | state | varchar | 45 | state where the asset is located | Yes | No | state name retrieved from the web page | TX | If the state name is not found in the web page, assign the default state name of the parent institution |
| 13 | zip | varchar | 45 | zip code of the asset | Yes | No | zip code retrieved from the web page | 76019 | If the zip code is not found in the web page, assign the default zip code of the parent institution |
| 14 | country | varchar | 45 | country where the asset is located | Yes | No | currently only USA | USA | |
| 15 | lastUpdated | datetime | | date and time of the last update done on this profile | Yes | No | yyyy-mm-dd hh:mi:ss (24 hr) | 2007-05-04 11:10:00 | This information is parsed from the top right corner of the web page |
Storage Structure of Asset "Equipment"
| # | Data Element | Type | Size | Description | Required | Multivalued | Structure | Example | Note |
|---|---|---|---|---|---|---|---|---|---|
| 1 | id | varchar | 45 | unique id that can differentiate an entity throughout the system | | | | | Equipment might be listed inside a research center or lab profile under an equipment section. If there are 4 pieces of equipment in a section of a research center profile, they will have sectionSerialNo values 1, 2, 3, 4. Where a piece of equipment has a separate profile web page, sectionSerialNo will not be present |
| 2 | profileId | int | 10 | unique id of the profile web page | Yes | No | - | - | profile id from the url |
| 3 | profileTypeId | int | 10 | unique id of the type of profile | Yes | No | - | 5 | There are six profile types, each with its own unique id: Faculty profile - 1, Research center profile - 2, Technology profile - 3, Facility profile - 4, Equipment profile - 5, Lab profile - 6 |
| 4 | institutionId | int | 10 | id of the respective institution | Yes | No | - | - | There are 9 academic institutions in the partnership: University of Texas at Arlington: 1, University of Texas - Pan American: 2, University of North Texas Health Science Center: 3, University of Texas at El Paso: 4, University of Texas at San Antonio: 5, University of Texas at Tyler: 6, UT Health Science Center - San Antonio: 7, University of North Texas: 8, University of Texas at Dallas: 9 |
| 5 | name | text | - | name of the asset | Yes | No | name of the equipment retrieved from the web page | Condensation Test Section | |
| 6 | data | text | - | complete html of the web page is crawled and stored in this field | Yes | No | - | - | |
| 7 | contactName | varchar | 255 | name of the primary contact person of this asset | No | No | full name of the person retrieved from the web page | Dr. Rao, K R. | |
| 8 | contactPhone | varchar | 45 | contact phone of the primary contact person of this asset | No | No | phone number retrieved from the web page | 817-272-3478 | |
| 9 | contactEmail | varchar | 255 | email of the primary contact person of this asset | No | No | email retrieved from the web page | rao@uta.edu | |
| 10 | address | varchar | 255 | physical address location of this asset | Yes | No | physical address retrieved from the web page | 701 South Nedderman Drive | If the address is not found in the web page, assign the default address of the parent institution |
| 11 | city | varchar | 45 | city where the asset is located | Yes | No | city name retrieved from the web page | Arlington | If the city name is not found in the web page, assign the default city name of the parent institution |
| 12 | state | varchar | 45 | state where the asset is located | Yes | No | state name retrieved from the web page | TX | If the state name is not found in the web page, assign the default state name of the parent institution |
| 13 | zip | varchar | 45 | zip code of the asset | Yes | No | zip code retrieved from the web page | 76019 | If the zip code is not found in the web page, assign the default zip code of the parent institution |
| 14 | country | varchar | 45 | country where the asset is located | Yes | No | currently only USA | USA | |
| 15 | lastUpdated | datetime | | date and time of the last update done on this profile | Yes | No | yyyy-mm-dd hh:mi:ss (24 hr) | 2009-05-27 09:11:49 | This information is parsed from the top right corner of the web page |
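The Equipment table does not spell out how sectionSerialNo enters the id. As one possible reading, and purely as an assumption, the sketch below appends it as a fourth dot-separated component when the equipment is listed inside a section of another profile, and omits it when the equipment has its own profile web page. The function name and the four-part layout are hypothetical and should be checked against the actual system.

```python
from typing import Optional

EQUIPMENT_PROFILE_TYPE_ID = 5  # Equipment profile - 5, per the tables


def build_equipment_id(institution_id: int, profile_id: int,
                       section_serial_no: Optional[int] = None) -> str:
    """Build an equipment id; the optional fourth component is an assumption."""
    base = f"{institution_id}.{EQUIPMENT_PROFILE_TYPE_ID}.{profile_id}"
    if section_serial_no is None:
        # Equipment with its own profile page: no sectionSerialNo.
        return base
    if section_serial_no < 1:
        raise ValueError("sectionSerialNo starts at 1")
    return f"{base}.{section_serial_no}"
```

Under this reading, four pieces of equipment listed in one section would receive ids ending in .1 through .4, which keeps them distinct even though they share a host page.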
Storage Structure of Asset "Lab"
| # | Data Element | Type | Size | Description | Required | Multivalued | Structure | Example | Note |
|---|---|---|---|---|---|---|---|---|---|
| 1 | id | varchar | 45 | unique id that can differentiate an entity throughout the system | Yes | No | institutionId.profileTypeId.profileId | "1.6.1913" | |
| 2 | profileId | int | 10 | unique id of the profile web page | Yes | No | - | - | profile id from the url |
| 3 | profileTypeId | int | 10 | unique id of the type of profile | Yes | No | - | 6 | There are six profile types, each with its own unique id: Faculty profile - 1, Research center profile - 2, Technology profile - 3, Facility profile - 4, Equipment profile - 5, Lab profile - 6 |
| 4 | institutionId | int | 10 | id of the respective institution | Yes | No | - | - | There are 9 academic institutions in the partnership: University of Texas at Arlington: 1, University of Texas - Pan American: 2, University of North Texas Health Science Center: 3, University of Texas at El Paso: 4, University of Texas at San Antonio: 5, University of Texas at Tyler: 6, UT Health Science Center - San Antonio: 7, University of North Texas: 8, University of Texas at Dallas: 9 |
| 5 | name | text | - | name of the profile | Yes | No | name of the laboratory or research group retrieved from the web page | Center for Renewable Energy, Science & Technology | |
| 6 | data | text | - | complete html of the web page is crawled and stored in this field | Yes | No | - | - | |
| 7 | contactName | varchar | 255 | name of the primary contact person of this asset | No | No | full name of the person retrieved from the web page | Dr. Aswath, Pranesh | |
| 8 | contactPhone | varchar | 45 | contact phone of the primary contact person of this asset | No | No | phone number retrieved from the web page | +1 817 272 7108 | |
| 9 | contactEmail | varchar | 255 | email of the primary contact person of this asset | No | No | email retrieved from the web page | aswath@uta.edu | |
| 10 | address | varchar | 255 | physical address location of this asset | Yes | No | physical address retrieved from the web page | 500 West First Street, Rm. 325 | If the address is not found in the web page, assign the default address of the parent institution |
| 11 | city | varchar | 45 | city where the asset is located | Yes | No | city name retrieved from the web page | Arlington | If the city name is not found in the web page, assign the default city name of the parent institution |
| 12 | state | varchar | 45 | state where the asset is located | Yes | No | state name retrieved from the web page | TX | If the state name is not found in the web page, assign the default state name of the parent institution |
| 13 | zip | varchar | 45 | zip code of the asset | Yes | No | zip code retrieved from the web page | 76019 | If the zip code is not found in the web page, assign the default zip code of the parent institution |
| 14 | country | varchar | 45 | country where the asset is located | Yes | No | currently only USA | USA | |
| 15 | lastUpdated | datetime | | date and time of the last update done on this profile | Yes | No | yyyy-mm-dd hh:mi:ss (24 hr) | 2008-12-20 11:33:21 | This information is parsed from the top right corner of the web page |
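Since the same data elements can be stored in structures other than database tables, a record could also be serialized as XML. The sketch below shows one way to do that with the Python standard library; the root element name labProfile and the choice of one child element per data element are assumptions, not a schema defined by this document.

```python
import xml.etree.ElementTree as ET


def record_to_xml(record: dict, root_tag: str = "labProfile") -> str:
    """Serialize a record as XML, one child element per data element."""
    root = ET.Element(root_tag)
    for field, value in record.items():
        child = ET.SubElement(root, field)
        child.text = "" if value is None else str(value)
    # ElementTree escapes characters such as & in text content.
    return ET.tostring(root, encoding="unicode")


# Example values taken from the Lab table above.
lab_record = {
    "id": "1.6.1913",
    "profileId": 1913,
    "profileTypeId": 6,
    "institutionId": 1,
    "name": "Center for Renewable Energy, Science & Technology",
    "city": "Arlington",
    "state": "TX",
    "zip": "76019",
    "country": "USA",
    "lastUpdated": "2008-12-20 11:33:21",
}

lab_xml = record_to_xml(lab_record)
```

The same function works for the other five profile types by passing a different root_tag.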
Comments (2)
Aug 23, 2012
rmittal says:
Can you please add XML schema examples for each of these? Also, use the description field to describe and define the data element. In addition, you can insert a column with the heading "Multivalued (Yes (Y)/ No (N))" and use that instead of the n=1, 2 etc. Use the structure to simply define what all can be included.
Here is an example (first row is the description of the columns and the second is an example).
| # | Data Element | Description | Required? | Multi-valued? | Notes |
|---|---|---|---|---|---|
| Assign a serial number | list the element here | provide a layman's description of the element | Yes or No if it is required or not | Yes or No if it is multi-valued | List all possible values where there is a controlled vocabulary for this field |
| 3 | profileTypeId | a unique id assigned to the type of profile | Yes | No | 1 for faculty, 2 for .... |
You might also want to break down structure into its components where multiple components are in play. You can do that by adding an extra column for Parent Data Element. I do like how you have separated the various entities, but you need to include examples for each of them. Also, don't call your headings ... table structure because you may be saving the information in a table but technically the information can be stored in any storage structure and not necessarily only tables.
Otherwise, in terms of content, I think you have covered it all.
We need the same thing for the need as well.
Aug 29, 2012
Md Abdus Salam says:
I have updated the document according to your guideline.