h1. Data Standards for Assets and Resources

There are six types of assets and resources for academic institutions: Faculty, Research Center, Technology, Facility, Equipment and Lab. A separate database table is maintained for each type. Assets/resources are known as "profile types", and each type is assigned a unique id, profileTypeId, as follows: FacultyProfile - 1, ResearchCenterProfile - 2, TechnologyProfile - 3, FacilityProfile - 4, EquipmentProfile - 5, LabProfile - 6. Each institution also has a unique id, called the institution id; for example, UT Arlington has institution id 1 and UT Dallas has institution id 9. Each individual asset/resource (in other words, each profile) is associated with a web page in the respective profile system, and this web page is accessed by a unique "profile id". The system generates a unique id by combining these three ids: institution id, profile type id and profile id. This ensures that any entity can be uniquely identified across the system. The combined id is stored in the "id" field of the Asset/Resource (Profile) table.
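
The id composition described above can be sketched in a few lines of Python. This is only an illustration: the function name and the lookup dictionary are hypothetical, while the profile type ids and the dot-separated layout come from this page.

```python
# Profile type ids as listed on this page.
PROFILE_TYPE_IDS = {
    "FacultyProfile": 1,
    "ResearchCenterProfile": 2,
    "TechnologyProfile": 3,
    "FacilityProfile": 4,
    "EquipmentProfile": 5,
    "LabProfile": 6,
}


def make_unique_id(institution_id: int, profile_type: str, profile_id: int) -> str:
    """Combine institution id, profile type id and profile id into one dotted id.

    The helper name is an assumption made for this sketch, not part of the system.
    """
    profile_type_id = PROFILE_TYPE_IDS[profile_type]
    return f"{institution_id}.{profile_type_id}.{profile_id}"


# A faculty profile with profile id 178 at UT Arlington (institution id 1).
print(make_unique_id(1, "FacultyProfile", 178))  # prints 1.1.178
```

Keeping the three parts separated by dots means the institution and the profile type can be read back from the id without a lookup.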

A record (entity) for a profile table is generated through the following steps:
* Fetch the asset-specific profile web page from the respective profile system
* Parse the web page
* Transform the parsed data into structured information
* Create the record for the designated profile table
* Insert the record
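
The steps above can be sketched as a single function. Everything here is an assumption made for illustration: the function name, the fetch and parse callables (which stand in for the real crawler and page parser), and the use of SQLite. Only the column names and the id layout are taken from this page.

```python
import sqlite3
from typing import Callable


def build_and_insert(
    conn: sqlite3.Connection,
    table: str,
    institution_id: int,
    profile_type_id: int,
    profile_id: int,
    fetch: Callable[[int], str],
    parse: Callable[[str], dict],
) -> dict:
    """Fetch, parse, transform, create and insert one profile record."""
    html = fetch(profile_id)   # step 1: fetch the profile web page
    parsed = parse(html)       # step 2: parse the web page
    # Steps 3 and 4: transform the parsed data and create the record.
    record = {
        "id": f"{institution_id}.{profile_type_id}.{profile_id}",
        "profileId": profile_id,
        "profileTypeId": profile_type_id,
        "institutionId": institution_id,
        "name": parsed["name"].strip(),
        "data": html,
    }
    # Step 5: insert the record. The table name is interpolated directly,
    # so it must come from trusted code, never from user input.
    columns = ", ".join(record)
    placeholders = ", ".join("?" for _ in record)
    conn.execute(
        f"INSERT INTO {table} ({columns}) VALUES ({placeholders})",
        tuple(record.values()),
    )
    conn.commit()
    return record
```

Passing the fetcher and parser in as arguments keeps the sketch independent of any particular profile system and makes each step easy to test on its own.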

The transformation of parsed data follows the rules described in the tables below.


h3. Storage Structure of Asset "Faculty"

|| \# \\ || Data Element \\ || Type \\ || Size \\ || Description || Required \\ || Multivalued \\ || Structure \\ || Example \\ || Note \\ ||
| 1 \\ | id \\ | varchar \\ | 45 \\ | unique id that can differentiate an entity throughout the system | Yes \\ | No \\ | institutionId.profileTypeId.profileId | "1.1.178" \\ | |
| 2 \\ | profileId \\ | int \\ | 10 \\ | unique id of profile web page | Yes \\ | No \\ | \- \\ | \- \\ | profile id from the url |
| 3 \\ | profileTypeId \\ | int \\ | 10 \\ | unique id of the type of profile | Yes \\ | No \\ | \- \\ | 1 \\ | For six different types of profiles there will be six different unique id ; Faculty profile - 1 , Research center profile - 2, Technology profile - 3, Facility profile - 4, Equipment profile - 5, Lab profile - 6 |
| 4 \\ | institutionId \\ | int \\ | 10 \\ | id of the respective academic institution \\ | Yes \\ | No \\ | \- \\ | \- \\ | There are 9 academic institutions in the partnership; University of Texas at Arlington : 1, University of Texas - Pan American : 2, University of North Texas Health Science Center : 3, University of Texas at El Paso : 4, University of Texas at San Antonio : 5, University of Texas at Tyler : 6, UT Health Science Center - San Antonio : 7, University of North Texas : 8, University of Texas at Dallas : 9 |
| 5 \\ | name \\ | text \\ | \- \\ | name of the faculty \\ | Yes | No \\ | name of the faculty retrieved from the contact information section of the web page \\ | Dr. Mohan J Kumar | |
| 6 \\ | data \\ | text \\ | \- \\ | complete html of the web page is crawled and stored in this field \\ | Yes \\ | No \\ | \- \\ | \- \\ | |
| 7 \\ | contactName \\ | varchar \\ | 255 \\ | name of the primary contact person of this asset \\ | No | No \\ | full name of the person  retrieved from the web page | Dr. Mohan J Kumar | |
| 8 \\ | contactPhone \\ | varchar | 45 \\ | contact phone of the primary contact person of this asset \\ | No | No | phone no retrieved from the web page \\ | 817-272-3610 | |
| 9 \\ | contactEmail \\ | varchar | 255 \\ | email of the primary contact person of this asset \\ | No | No | email retrieved from the web page \\ | mkumar@uta.edu | |
| 10 \\ | address \\ | varchar | 255 \\ | physical address location of this asset \\ | Yes | No | physical address retrieved from the web page \\ | 500 UTA Blvd \\ | If address is not found in the web page, assign the default address of the parent institution |
| 11 \\ | city \\ | varchar | 45 \\ | city where the asset is located  \\ | Yes | No | city name retrieved from the web page \\ | Arlington | If city name is not found in the web page, assign the default city name of the parent institution |
| 12 \\ | state \\ | varchar | 45 \\ | state where the asset is located \\ | Yes | No | state name retrieved from the web page \\ | TX | If state name is not found in the web page, assign the default state name of the parent institution |
| 13 \\ | zip \\ | varchar | 45 \\ | zip code where the asset is located \\ | Yes | No | zip code retrieved from the web page \\ | 76019 | If zip code is not found in the web page, assign the default zip code of the parent institution |
| 14 \\ | country \\ | varchar | 45 \\ | country where the asset is located \\ | Yes | No | currently only USA \\ | USA | |
| 15 \\ | lastUpdated \\ | datetime \\ | | date and time of last update done on this profile \\ | Yes | No | yyyy-mm-dd hh:mi:ss (24 hr) | 2012-08-20 12:41:13 | This information is parsed from top right corner of the web page |
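
The fallback rule in the Note column for address, city, state and zip can be sketched as follows. The function name and the shape of the defaults dictionary are assumptions made for illustration; this page does not say where the parent institution's default values are stored.

```python
# Location fields that fall back to the parent institution's defaults.
LOCATION_FIELDS = ("address", "city", "state", "zip")


def apply_location_defaults(parsed: dict, institution_defaults: dict) -> dict:
    """Return a copy of parsed with missing location fields filled in.

    A field counts as missing when it is absent, None, or only whitespace.
    """
    record = dict(parsed)
    for field in LOCATION_FIELDS:
        value = (record.get(field) or "").strip()
        record[field] = value or institution_defaults[field]
    # The tables list USA as the only country currently in use.
    record["country"] = "USA"
    return record


# Placeholder defaults, invented for this example only.
defaults = {"address": "DEFAULT ADDRESS", "city": "DEFAULT CITY",
            "state": "TX", "zip": "00000"}
print(apply_location_defaults({"city": "Arlington", "zip": " "}, defaults))
```

Only the four location fields are defaulted; contact fields are optional in the tables and are left untouched.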

h3. Storage Structure of Asset "Research Center"

|| \# \\ || Data Element \\ || Type \\ || Size \\ || Description \\ || Required \\ || Multivalued \\ || Structure \\ || Example \\ || Note \\ ||
| 1 \\ | id \\ | varchar \\ | 45 \\ | unique id that can differentiate an entity throughout the system | Yes | No \\ | institutionId.profileTypeId.profileId \\ | "1.2.507" | |
| 2 \\ | profileId \\ | int \\ | 10 \\ | unique id of profile web page | Yes | No \\ | \- | \- \\ | profile id from the url |
| 3 \\ | profileTypeId \\ | int \\ | 10 \\ | unique id of the type of profile | Yes | No \\ | \- | 2 \\ | For six different types of profiles there will be six different unique id ; Faculty profile - 1 , Research center profile - 2, Technology profile - 3, Facility profile - 4, Equipment profile - 5, Lab profile - 6 |
| 4 \\ | institutionId \\ | int \\ | 10 \\ | id of the respective academic institution \\ | Yes | No | \- | \- \\ | There are 9 academic institutions in the partnership; University of Texas at Arlington : 1, University of Texas - Pan American : 2, University of North Texas Health Science Center : 3, University of Texas at El Paso : 4, University of Texas at San Antonio : 5, University of Texas at Tyler : 6, UT Health Science Center - San Antonio : 7, University of North Texas : 8, University of Texas at Dallas : 9 |
| 5 \\ | name \\ | text \\ | \- \\ | name of the asset \\ | Yes | No \\ | name of the research center retrieved from the web page \\ | Electronics MEMS & Nanoelectronics Systems Packaging Center | |
| 6 \\ | data \\ | text \\ | \- \\ | complete html of the web page is crawled and stored in this field | Yes \\ | No \\ | \- \\ | \- \\ | |
| 7 \\ | contactName \\ | varchar \\ | 255 \\ | name of the primary contact person of this asset | No | No | full name of the person  retrieved from the web page | Dr. Lewis, Frank L. | |
| 8 \\ | contactPhone \\ | varchar | 45 | contact phone of the primary contact person of this asset | No | No | phone no retrieved from the web page | 817-272-5972 | |
| 9 \\ | contactEmail \\ | varchar | 255 | email of the primary contact person of this asset | No | No | email retrieved from the web page | lewis@uta.edu | |
| 10 \\ | address \\ | varchar | 255 | physical address location of this asset \\ | Yes | No | physical address retrieved from web page | 701 South Nedderman Drive | If address is not found in the web page, assign the default address of the parent institution |
| 11 \\ | city \\ | varchar | 45 | city where the asset is located | Yes | No | city name retrieved from web page | Arlington | If city name is not found in the web page, assign the default city name of the parent institution |
| 12 \\ | state \\ | varchar | 45 | state where the asset is located | Yes | No | state name retrieved from web page | TX | If state name is not found in the web page, assign the default state name of the parent institution |
| 13 \\ | zip \\ | varchar | 45 | zip code where the asset is located | Yes | No | zip code retrieved from web page | 76019 | If zip code is not found in the web page, assign the default zip code  of the parent institution |
| 14 \\ | country \\ | varchar | 45 | country where the asset is located | Yes | No | currently only USA | USA | |
| 15 \\ | lastUpdated \\ | datetime | | date and time of last update done on this profile \\ | Yes | No | yyyy-mm-dd hh:mi:ss (24 hr) | 2009-05-27 08:31:43  | This information is parsed from top right corner of the web page |
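
The lastUpdated structure, yyyy-mm-dd hh:mi:ss on a 24-hour clock, corresponds to a standard strftime pattern. The sketch below assumes Python's datetime module; the helper names are hypothetical, and how the timestamp appears on the source web page is not covered here.

```python
from datetime import datetime

# yyyy-mm-dd hh:mi:ss (24 hr) expressed as a strftime/strptime pattern.
LAST_UPDATED_FORMAT = "%Y-%m-%d %H:%M:%S"


def format_last_updated(moment: datetime) -> str:
    """Render a datetime in the stored lastUpdated layout."""
    return moment.strftime(LAST_UPDATED_FORMAT)


def parse_last_updated(text: str) -> datetime:
    """Parse a stored lastUpdated value; raises ValueError if it does not match."""
    return datetime.strptime(text.strip(), LAST_UPDATED_FORMAT)


print(format_last_updated(datetime(2009, 5, 27, 8, 31, 43)))  # prints 2009-05-27 08:31:43
```

Round-tripping a value through both helpers is a cheap way to confirm that a crawled timestamp matches the expected layout before the record is inserted.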

h3. Storage Structure of Asset "Technology"

|| \# \\ || Data Element \\ || Type \\ || Size \\ || Description \\ || Required \\ || Multivalued \\ || Structure \\ || Example \\ || Note \\ ||
| 1 \\ | id \\ | varchar \\ | 45 \\ | unique id that can differentiate an entity throughout the system | Yes \\ | No \\ | institutionId.profileTypeId.profileId | "1.3.469" \\ | |
| 2 \\ | profileId \\ | int \\ | 10 \\ | unique id of profile web page | Yes \\ | No \\ | \- \\ | \- \\ | profile id from the url \\ |
| 3 \\ | profileTypeId \\ | int \\ | 10 \\ | unique id of the type of profile | Yes \\ | No \\ | \- \\ | 3 \\ | For six different types of profiles there will be six different unique id ; Faculty profile - 1 , Research center profile - 2, Technology profile - 3, Facility profile - 4, Equipment profile - 5, Lab profile - 6 |
| 4 \\ | institutionId \\ | int \\ | 10 \\ | id of the respective academic institution \\ | Yes \\ | No \\ | \- \\ | \- \\ | There are 9 academic institutions in the partnership; University of Texas at Arlington : 1, University of Texas - Pan American : 2, University of North Texas Health Science Center : 3, University of Texas at El Paso : 4, University of Texas at San Antonio : 5, University of Texas at Tyler : 6, UT Health Science Center - San Antonio : 7, University of North Texas : 8, University of Texas at Dallas : 9 |
| 5 \\ | name \\ | text \\ | \- \\ | name of the asset \\ | Yes \\ | No \\ | name of the technology retrieved from the web page \\ | Engine Oil Additive | |
| 6 \\ | data \\ | text \\ | \- \\ | complete html of the web page is crawled and stored in this field | Yes \\ | No \\ | \- \\ | \- \\ | |
| 7 \\ | contactName \\ | varchar \\ | 255 \\ | name of the primary contact person of this asset \\ | No \\ | No \\ | full name of the person  retrieved from the web page | Natalia Toth \\ | |
| 8 \\ | contactPhone \\ | varchar \\ | 45 \\ | phone no of the primary contact for this asset \\ | No \\ | No \\ | phone no retrieved from the web page \\ | (915) 747-7007 | |
| 9 \\ | contactEmail \\ | varchar \\ | 255 \\ | email address of the primary contact for this asset \\ | No \\ | No \\ | email retrieved from the web page | savena@utep.edu \\ | |
| 10 \\ | address \\ | varchar \\ | 255 \\ | physical address location of this asset \\ | Yes \\ | No \\ | physical address retrieved from web page | 500 West University Avenue | If address is not found in the web page, assign the default address of the parent institution |
| 11 \\ | city \\ | varchar \\ | 45 \\ | city where the asset is located | Yes \\ | No \\ | city name retrieved from web page | El Paso \\ | If city name is not found in the web page, assign the default city name of the parent institution |
| 12 \\ | state \\ | varchar \\ | 45 \\ | state where the asset is located | Yes \\ | No \\ | state name retrieved from web page | TX \\ | If state name is not found in the web page, assign the default state name of the parent institution |
| 13 \\ | zip \\ | varchar \\ | 45 \\ | zip code of the asset | Yes \\ | No \\ | zip code retrieved from the web page | 79968 \\ | If zip code  is not found in the web page, assign the default zip code of the parent institution |
| 14 \\ | country \\ | varchar \\ | 45 \\ | country where the asset is located | Yes \\ | No \\ | currently only USA | USA \\ | |
| 15 \\ | lastUpdated \\ | datetime | | date & time of last update done on this profile \\ | Yes \\ | No \\ | yyyy-mm-dd hh:mi:ss (24 hr) | 2007-05-04 11:10:00 | This information is parsed from top right corner of the web page |

h3. Storage Structure of Asset "Facility"

|| \# \\ || Data Element \\ || Type \\ || Size \\ || Description \\ || Required \\ || Multivalued \\ || Structure \\ || Example \\ || Note \\ ||
| 1 \\ | id \\ | varchar \\ | 45 \\ | unique id that can differentiate an entity throughout the system | Yes \\ | No \\ | institutionId.profileTypeId.profileId \\ | "1.4.591" \\ | |
| 2 \\ | profileId \\ | int \\ | 10 \\ | unique id of profile web page | Yes \\ | No \\ | \- \\ | \- \\ | profile id from the url |
| 3 \\ | profileTypeId \\ | int \\ | 10 \\ | unique id of the type of profile \\ | Yes \\ | No \\ | \- \\ | 4 \\ | For six different types of profiles there will be six different unique id ; Faculty profile - 1 , Research center profile - 2, Technology profile - 3, Facility profile - 4, Equipment profile - 5, Lab profile - 6 |
| 4 \\ | institutionId \\ | int \\ | 10 \\ | id of the respective academic institution | Yes \\ | No \\ | \-  \\ | \- \\ | There are 9 academic institutions in the partnership; University of Texas at Arlington : 1, University of Texas - Pan American : 2, University of North Texas Health Science Center : 3, University of Texas at El Paso : 4, University of Texas at San Antonio : 5, University of Texas at Tyler : 6, UT Health Science Center - San Antonio : 7, University of North Texas : 8, University of Texas at Dallas : 9 \\ |
| 5 \\ | name \\ | text \\ | \- \\ | name of the asset \\ | Yes \\ | No \\ | name of the facility retrieved from the web page | Characterization Center for Materials and Biology | |
| 6 \\ | data \\ | text \\ | \- \\ | complete html of the web page is crawled and stored in this field | Yes \\ | No \\ | \- \\ | \- \\ | |
| 7 \\ | contactName \\ | varchar \\ | 255 \\ | name of the Primary contact person of this asset | No \\ | No \\ | full name of the person  retrieved from the web page | Dr. Stephanou, Harry E | |
| 8 \\ | contactPhone \\ | varchar \\ | 45 \\ | contact phone of the primary contact person of this asset | No \\ | No \\ | phone no retrieved from the web page | 8172725 | |
| 9 \\ | contactEmail \\ | varchar \\ | 255 \\ | email of the primary contact person of this asset | No \\ | No \\ | email retrieved from the web page | hes@arri.uta.edu | |
| 10 \\ | address \\ | varchar \\ | 255 \\ | physical address location of this asset | Yes \\ | No \\ | physical address retrieved from web page | 701 South Nedderman Drive | If address is not found in the web page, assign the default address of the parent institution |
| 11 \\ | city \\ | varchar \\ | 45 \\ | city where the asset is located | Yes \\ | No \\ | city name retrieved from web page | Arlington | If city name is not found in the web page, assign the default city name of the parent institution |
| 12 \\ | state \\ | varchar \\ | 45 \\ | state where the asset is located \\ | Yes \\ | No \\ | state name retrieved from web page | TX \\ | If state name is not found in the web page, assign the default state name of the parent institution |
| 13 \\ | zip \\ | varchar \\ | 45 \\ | zip code of the asset \\ | Yes \\ | No \\ | zip code retrieved from the web page \\ | 76019 \\ | If zip code  is not found in the web page, assign the default zip code of the parent institution |
| 14 \\ | country \\ | varchar \\ | 45 \\ | country where the asset is located \\ | Yes \\ | No \\ | currently only USA \\ | USA \\ | |
| 15 \\ | lastUpdated \\ | datetime \\ | | date & time of last update done on this profile | Yes \\ | No \\ | yyyy-mm-dd hh:mi:ss (24 hr) | 2007-05-04 11:10:00 | This information is parsed from top right corner of the web page |

h3. Storage Structure of Asset "Equipment"

|| \# \\ || Data Element \\ || Type \\ || Size \\ || Description \\ || Required \\ || Multivalued \\ || Structure \\ || Example \\ || Note \\ ||
| 1 | id \\ | varchar \\ | 45 \\ | unique id that can differentiate an entity throughout the system | Yes \\ | No \\ | institutionId.profileTypeId.profileId.sectionSerialNo \\ | "1.5.1750.1" \\ "1.5.1858" \\ | Equipment might be listed inside a research center/lab profile under an equipment section. If there are 4 pieces of equipment in a section of a research center profile, they will have sectionSerialNo values 1, 2, 3, 4. When a piece of equipment has its own profile web page, sectionSerialNo is not present \\ |
| 2 | profileId \\ | int \\ | 10 \\ | unique id of profile web page | Yes \\ | No \\ | \- \\ | \- \\ | profile id from the url |
| 3 | profileTypeId \\ | int \\ | 10 \\ | unique id of the type of profile | Yes \\ | No \\ | \- \\ | 5 \\ | For six different types of profiles there will be six different unique id ; Faculty profile - 1 , Research center profile - 2, Technology profile - 3, Facility profile - 4, Equipment profile - 5, Lab profile - 6 |
| 4 | institutionId \\ | int \\ | 10 \\ | id of the respective institution \\ | Yes \\ | No \\ | \- \\ | \- \\ | There are 9 academic institutions in the partnership; University of Texas at Arlington : 1, University of Texas - Pan American : 2, University of North Texas Health Science Center : 3, University of Texas at El Paso : 4, University of Texas at San Antonio : 5, University of Texas at Tyler : 6, UT Health Science Center - San Antonio : 7, University of North Texas : 8, University of Texas at Dallas : 9 \\ |
| 5 | name \\ | text \\ | \- \\ | name of the asset \\ | Yes \\ | No \\ | name of the equipment retrieved from the web page \\ | Condensation Test Section | \\ |
| 6 | data \\ | text \\ | \- \\ | complete html of the web page is crawled and stored in this field | Yes \\ | No \\ | \- \\ | \- \\ | |
| 7 | contactName \\ | varchar \\ | 255 \\ | name of the primary contact person of this asset | No \\ | No \\ | full name of the person  retrieved from the web page | Dr. Rao, K R. | |
| 8 \\ | contactPhone \\ | varchar \\ | 45 \\ | contact phone of the primary contact person of this asset | No \\ | No \\ | phone No retrieved from the web page | 817-272-3478 | |
| 9 \\ | contactEmail \\ | varchar \\ | 255 \\ | email of the primary contact person of this asset | No \\ | No \\ | email retrieved from the web page | rao@uta.edu | |
| 10 \\ | address \\ | varchar \\ | 255 \\ | physical address location of this asset \\ | Yes \\ | No \\ | physical address retrieved from the web page \\ | 701 South Nedderman Drive | If address is not found in the web page, assign the default address of the parent institution |
| 11 \\ | city \\ | varchar \\ | 45 \\ | city where asset is located \\ | Yes \\ | No \\ | city name retrieved from web page | Arlington \\ | If city name is not found in the web page, assign the default city name of the parent institution |
| 12 \\ | state \\ | varchar \\ | 45 \\ | state where asset is located \\ | Yes \\ | No \\ | state name retrieved from web page | TX \\ | If state name is not found in the web page, assign the default state name of the parent institution |
| 13 \\ | zip \\ | varchar \\ | 45 \\ | zip code of the asset \\ | Yes \\ | No \\ | zip code retrieved from web page \\ | 76019 \\ | If zip code is not found in the web page, assign the default zip code  of the parent institution |
| 14 | country \\ | varchar \\ | 45 \\ | country where the asset is located \\ | Yes \\ | No \\ | currently only USA \\ | USA \\ | |
| 15 \\ | lastUpdated \\ | datetime \\ | | date & time of last update done on this profile | Yes \\ | No \\ | yyyy-mm-dd hh:mi:ss (24 hr) | 2009-05-27 09:11:49 | This information is parsed from top right corner of the web page |
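
The Equipment id differs from the other profile types because of the optional sectionSerialNo. A minimal sketch follows, with hypothetical helper names; only the id layout, the profile type id 5 and the numbering from 1 come from the table above.

```python
from typing import List, Optional

EQUIPMENT_PROFILE_TYPE_ID = 5  # Equipment profile type id from this page


def make_equipment_id(
    institution_id: int,
    profile_id: int,
    section_serial_no: Optional[int] = None,
) -> str:
    """Build an Equipment id, appending sectionSerialNo only when it is given."""
    base = f"{institution_id}.{EQUIPMENT_PROFILE_TYPE_ID}.{profile_id}"
    if section_serial_no is None:
        # The equipment has its own profile web page.
        return base
    # The equipment is listed in an equipment section of another profile.
    return f"{base}.{section_serial_no}"


def ids_for_section(institution_id: int, profile_id: int, count: int) -> List[str]:
    """Ids for count items listed in one equipment section, numbered from 1."""
    return [make_equipment_id(institution_id, profile_id, n)
            for n in range(1, count + 1)]


print(make_equipment_id(1, 1858))     # prints 1.5.1858
print(make_equipment_id(1, 1750, 1))  # prints 1.5.1750.1
```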


h3. Storage Structure of Asset "Lab"

|| \# || Data Element || Type \\ || Size \\ || Description \\ || Required \\ || Multivalued \\ || Structure \\ || Example \\ || Note \\ ||
| 1 \\ | id \\ | varchar \\ | 45 \\ | unique id that can differentiate an entity throughout the system \\ | Yes \\ | No \\ | institutionId.profileTypeId.profileId \\ | "1.6.1913" \\ | |
| 2 \\ | profileId \\ | int \\ | 10 \\ | unique id of profile web page \\ | Yes \\ | No \\ | \- \\ | \- \\ | profile id from the url |
| 3 \\ | profileTypeId \\ | int \\ | 10 \\ | unique id of the type of profile \\ | Yes \\ | No \\ | \- \\ | 6 \\ | For six different types of profiles there will be six different unique id ; Faculty profile - 1 , Research center profile - 2, Technology profile - 3, Facility profile - 4, Equipment profile - 5, Lab profile - 6 \\ |
| 4 \\ | institutionId \\ | int \\ | 10 \\ | id of the respective institution \\ | Yes \\ | No \\ | \- \\ | \- \\ | There are 9 academic institutions in the partnership; University of Texas at Arlington : 1, University of Texas - Pan American : 2, University of North Texas Health Science Center : 3, University of Texas at El Paso : 4, University of Texas at San Antonio : 5, University of Texas at Tyler : 6, UT Health Science Center - San Antonio : 7, University of North Texas : 8, University of Texas at Dallas : 9 \\ |
| 5 \\ | name \\ | text \\ | \- \\ | name of the profile \\ | Yes \\ | No \\ | name of the laboratory or research groups from the web page     \\ | Center for Renewable Energy, Science & Technology | |
| 6 \\ | data | text \\ | \- | complete html of the web page is crawled and stored in this field | Yes \\ | No \\ | \- \\ | \- \\ | |
| 7 \\ | contactName | varchar \\ | 255 \\ | name of the primary contact person of this asset \\ | No \\ | No \\ | full name of the person  retrieved from the web page \\ | Dr. Aswath, Pranesh \\ | |
| 8 \\ | contactPhone \\ | varchar \\ | 45 \\ | contact phone of the primary contact person of this asset \\ | No \\ | No \\ | phone No retrieved from the web page \\ | \+1 817 272 7108 | |
| 9 \\ | contactEmail \\ | varchar \\ | 255 \\ | email of the primary contact person of this asset \\ | No \\ | No \\ | email retrieved from the web page \\ | aswath@uta.edu | |
| 10 \\ | address \\ | varchar \\ | 255 \\ | physical address location of this asset \\ | Yes \\ | No \\ | physical address retrieved from web page \\ | 500 West First Street, Rm. 325 | If address is not found in the web page, assign the default address of the parent institution |
| 11 \\ | city \\ | varchar \\ | 45 \\ | city where the asset is located \\ | Yes \\ | No \\ | city name retrieved from web page \\ | Arlington \\ | If city name is not found in the web page, assign the default city name of the parent institution \\ |
| 12 \\ | state \\ | varchar \\ | 45 \\ | state where the asset is located \\ | Yes \\ | No \\ | state name retrieved from web page \\ | TX \\ | If state name is not found in the web page, assign the default state name of the parent institution |
| 13 \\ | zip \\ | varchar \\ | 45 \\ | zip code of the asset \\ | Yes \\ | No \\ | zip code retrieved from web page \\ | 76019 \\ | If zip code is not found in the web page, assign the default zip code  of the parent institution |
| 14 \\ | country \\ | varchar \\ | 45 \\ | country where the asset is located \\ | Yes \\ | No \\ | currently only USA \\ | USA \\ | |
| 15  | lastUpdated \\ | datetime \\ | | date & time of last update done on this profile | Yes \\ | No \\ | yyyy-mm-dd hh:mi:ss (24 hr) \\ | 2008-12-20 11:33:21 | This information is parsed from top right corner of the web page \\ |
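
All six tables share the same Required flags and varchar sizes, so a record can be checked against them before it is inserted. The validator below is a sketch only: the function name and the wording of its messages are assumptions, while the field lists and size limits are copied from the tables.

```python
from typing import List

# Fields marked Required = Yes in the tables.
REQUIRED_FIELDS = (
    "id", "profileId", "profileTypeId", "institutionId", "name", "data",
    "address", "city", "state", "zip", "country", "lastUpdated",
)

# Size limits of the varchar columns in the tables.
MAX_LENGTH = {
    "id": 45, "contactName": 255, "contactPhone": 45, "contactEmail": 255,
    "address": 255, "city": 45, "state": 45, "zip": 45, "country": 45,
}


def validate_record(record: dict) -> List[str]:
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    for field in REQUIRED_FIELDS:
        if record.get(field) in (None, ""):
            problems.append(f"missing required field: {field}")
    for field, limit in MAX_LENGTH.items():
        value = record.get(field)
        if value is not None and len(str(value)) > limit:
            problems.append(f"{field} exceeds {limit} characters")
    return problems
```

Returning every problem at once, rather than stopping at the first, makes it easier to see how far a crawled page deviates from the expected structure.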