DataGov360 Diagnostic
Data management requirements are defined in silos, by business functions involved in managing each of the data asset lifecycle phases.
Certain business functions across the agency collaborate with each other to define common data management requirements and manage data across their silos.
Data management requirements are defined by the organization and formally documented as organization information requirements. They are used to create asset, project, and employer information requirements.
In the multiple-editor stewardship DMM, there are no workflow, versioning, history, or overwrite controls. In the multiple-supplier data aggregation DMM, the process is manual, involving complicated transformations, spreadsheets, and manual notifications and communications. In the govern and conflate change requests model, requests are managed through manual workflow batches with manual updates. The derivative product generation DMM relies on a manual batch process, producing data products only upon user request.
In the multiple-editor stewardship DMM, there is blended support for check-in/out, workflow approval, versioning, and temporal data management. The multiple-supplier data aggregation DMM uses semiautomated ETL processes with a registry of scripts and manual discrepancy resolution. Govern and conflate change requests are handled by a separate workflow transaction system that processes the review and approval of changes. Derivative product generation is facilitated by a semiautomated tool that generates updated areas or content based on project plans or a calendar.
The multiple-editor stewardship model includes fully integrated workflow approval, versioning, and temporal data management. Multiple-supplier data aggregation involves automatic notifications based on change detection, data pipeline integration (ELT/ETL/API), and automated processing, loading, QA/QC, transformation, publishing, and error reporting. Govern and conflate change requests are managed by a rules-based engine for workflow processing and approvals, automatically integrated in transactions. Derivative product generation automatically detects the need for generating derivative data products.
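The change-detection step that triggers automatic notifications in multiple-supplier aggregation could look roughly like the sketch below. It is a minimal illustration, not a prescribed implementation: it assumes each supplier delivers a snapshot as a dictionary keyed by record ID, and the function names (`fingerprint`, `detect_changes`, `build_notifications`) and the sample record fields are hypothetical.

```python
import hashlib
import json


def fingerprint(record: dict) -> str:
    """Return a stable content hash for one record (key order does not matter)."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def detect_changes(previous: dict, current: dict) -> dict:
    """Compare two supplier snapshots keyed by record ID.

    Returns the sorted IDs that were added, removed, or modified.
    """
    prev_ids, curr_ids = set(previous), set(current)
    modified = sorted(
        rid
        for rid in prev_ids & curr_ids
        if fingerprint(previous[rid]) != fingerprint(current[rid])
    )
    return {
        "added": sorted(curr_ids - prev_ids),
        "removed": sorted(prev_ids - curr_ids),
        "modified": modified,
    }


def build_notifications(supplier: str, changes: dict) -> list:
    """Turn detected changes into one notification message per record."""
    return [
        f"{supplier}: {kind} {rid}"
        for kind in ("added", "removed", "modified")
        for rid in changes[kind]
    ]


# Hypothetical snapshots from one supplier, before and after an update.
previous_snapshot = {
    "B-1": {"type": "bridge", "length_m": 120},
    "B-2": {"type": "bridge", "length_m": 45},
}
current_snapshot = {
    "B-1": {"type": "bridge", "length_m": 125},
    "B-3": {"type": "bridge", "length_m": 60},
}

changes = detect_changes(previous_snapshot, current_snapshot)
messages = build_notifications("county_a", changes)
```

In a fully automated setup, the messages would feed the downstream pipeline (processing, QA/QC, publishing) rather than a person's inbox; that wiring is outside the scope of this sketch.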
Data supply platform decisions and management are not coordinated at the enterprise level. Business units make their own purchase and data management decisions.
Business units participate in enterprise IT governance on purchasing for data center or low-level IaaS. The agency is not investing in a collaborative data supply architecture to take advantage of emerging data technologies.
An active subcommittee and/or peer architecture governance body has been established at the enterprise level to collaborate closely with, and within, enterprise IT governance on the full lifecycle of data supply platforms.
Data assets in the portfolio are classified based on the MIS needs and use cases of individual business functions (e.g., planning, design, asset management, etc.). The classification is not based on the features, purpose, use, and domains of knowledge associated with the data asset. As a result, data asset governance policies cannot be created at the enterprise level across business functions.
An open standards-based classification system for highway infrastructure assets has been adopted by certain business functions in the organization. Business functions have federated these classification systems to facilitate data governance policy development and data management (e.g., data indexing, exchange, quality assessment, analysis).
The classification system adopted by the agency spans multiple asset infrastructure concepts (e.g., rail, road, pipelines, transit). This allows data governance and management policies, rules, and standards to be created and leveraged across various knowledge domains.
Object types have been identified corresponding to the infrastructure assets based on the MIS system. The object types have been mapped to the MIS-based classification system.
MIS-based OTLs have been federated to ensure they can be attached to the same object type across MIS systems. That is, the MIS systems have adopted a common definition for the object types associated with infrastructure assets. Object-type data are semantically described.
Autonomous OTLs in MIS are federated with an enterprise OTL that has been created using open standards. The object types in the enterprise OTL have been mapped to an open standards-based classification system, which spans multiple infrastructure concepts. OTLs in the MIS can be semantically linked to identify and relate information.
The metadata associated with object types are managed for each MIS system. Object-type metadata are not managed at the enterprise level in an enterprise data dictionary. Therefore, there is no consistent definition of the object-type semantics at the enterprise level.
Metadata associated with the object types have been aligned across MIS systems and business functions to facilitate data exchange. A federated enterprise data dictionary has been created at the enterprise level to store information about object-type semantics.
Object-type semantics are defined in the enterprise data dictionary across asset infrastructure concepts (e.g., rail, road, pipelines, transit). Version control has been enabled on the object types in the autonomous systems used by each business function and at the enterprise level for the open standards-based enterprise OTL.
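One way to picture a version-controlled enterprise data dictionary that autonomous MIS object types map into is the sketch below. It is illustrative only: the class names (`ObjectTypeVersion`, `DictionaryEntry`, `DataDictionary`), the MIS names, and the sample object type are assumptions, not part of any specific OTL standard or product.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ObjectTypeVersion:
    """One immutable, numbered definition of an object type."""
    version: int
    definition: str
    attributes: tuple


@dataclass
class DictionaryEntry:
    """An enterprise object type with its full version history."""
    name: str
    versions: list = field(default_factory=list)

    def publish(self, definition: str, attributes) -> ObjectTypeVersion:
        """Append a new version; earlier versions are never overwritten."""
        new = ObjectTypeVersion(len(self.versions) + 1, definition, tuple(attributes))
        self.versions.append(new)
        return new

    def current(self) -> ObjectTypeVersion:
        """Return the latest version (raises IndexError if none was published)."""
        return self.versions[-1]


@dataclass
class DataDictionary:
    """Enterprise dictionary plus mappings from MIS-local object type names."""
    entries: dict = field(default_factory=dict)
    local_map: dict = field(default_factory=dict)  # (mis, local_name) -> enterprise name

    def register(self, name: str) -> DictionaryEntry:
        """Return the entry for an enterprise object type, creating it if needed."""
        return self.entries.setdefault(name, DictionaryEntry(name))

    def map_local(self, mis: str, local_name: str, enterprise_name: str) -> None:
        """Record that an MIS-local object type corresponds to an enterprise one."""
        self.local_map[(mis, local_name)] = enterprise_name

    def resolve(self, mis: str, local_name: str) -> ObjectTypeVersion:
        """Look up the current enterprise definition for an MIS-local object type."""
        return self.entries[self.local_map[(mis, local_name)]].current()


# Hypothetical example: two MIS use different local names for the same asset.
dictionary = DataDictionary()
culvert = dictionary.register("Culvert")
culvert.publish("Structure that channels water under a road.", ["material"])
culvert.publish(
    "Structure that channels water under a road, rail line, or trail.",
    ["material", "span_m"],
)
dictionary.map_local("asset_mgmt", "CULV", "Culvert")
dictionary.map_local("design", "culvert_pipe", "Culvert")

resolved = dictionary.resolve("design", "culvert_pipe")
```

Because both local names resolve to the same enterprise entry, the two systems share one definition while keeping their own identifiers, and the retained history shows how the definition changed between versions.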
Business lines do not feel responsible for data quality and see it as an IT problem. There is no specific policy, and internal trust in data is poor, so decisions are based on experience rather than data. No official agreements with third parties exist, and any informal agreements are not recognized by top management.
A DQM framework has been defined and invested in, with policy, responsibilities, metrics, goals, and the means to track objectives. Data are collected under a controlled process. Business lines agree that they are part of the data process.
DQM is embedded in the data governance culture with ongoing assessments and pilot automated models, including RPA and AI. Data quality indicators are monitored according to policy, with shortcomings identified and mitigated until objectives are achieved.
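Monitoring data quality indicators against policy can be sketched as follows. This is a minimal example under stated assumptions: the two indicators (completeness and uniqueness), the threshold values, and the sample records are hypothetical, and a real DQM policy would define its own indicators and targets.

```python
def completeness(records: list, required_fields: list) -> float:
    """Share of records in which every required field is populated."""
    if not records:
        return 0.0
    complete = sum(
        all(r.get(f) not in (None, "") for f in required_fields) for r in records
    )
    return complete / len(records)


def uniqueness(records: list, key: str) -> float:
    """Share of distinct key values relative to the number of records."""
    if not records:
        return 0.0
    return len({r.get(key) for r in records}) / len(records)


def shortfalls(indicators: dict, policy: dict) -> dict:
    """Return each indicator below its policy threshold as (value, threshold)."""
    return {
        name: (value, policy[name])
        for name, value in indicators.items()
        if name in policy and value < policy[name]
    }


# Hypothetical sign inventory with one missing date and one duplicate ID.
records = [
    {"id": "S-1", "type": "sign", "installed": "2019-04-01"},
    {"id": "S-2", "type": "sign", "installed": ""},
    {"id": "S-3", "type": "sign", "installed": "2021-08-15"},
    {"id": "S-3", "type": "sign", "installed": "2021-08-15"},
]

indicators = {
    "completeness": completeness(records, ["id", "type", "installed"]),
    "uniqueness": uniqueness(records, "id"),
}
policy = {"completeness": 0.95, "uniqueness": 0.70}  # assumed thresholds

gaps = shortfalls(indicators, policy)
```

Here `gaps` lists only the indicators that miss their objective, which is the set a mitigation backlog would track until the thresholds are met.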
No assessments support data use outside the originating functional unit, with limited understanding of the data asset inventory. Data quality issues are identified too late, causing inconsistent results, and solutions for external data needs are disruptive and inefficient.
Data assessment has been piloted for core assets, identifying quality issues but without addressing the root cause. Local workarounds exist, and data models are mapped and used by transformation tools. Readiness assessment inventory efforts are captured in portfolio management tools.
All prioritized data assets have been assessed, with improvement recommendations backlogged and aligned to use roadmaps. Business lines are enhancing data quality through organized capabilities and capital programs linked to data improvements, adding value to core activities. Agreements with third parties are signed, with regular performance measured and managed.
Data assets are published for single use, not for self-service discovery, and are not easily accessible to other units, complicating cross-functional interoperability. Data reusability is hampered as original creation does not consider requirements beyond compliance and system needs.
FAIR principles are applied to high-priority core data assets. Internal data trust depends on identifying known issues. While no formal agreements exist with third parties, solutions are in place for data mismatches. Data lineages are mostly identified, with key segments included in business processes but not always through data stores.
Trust and satisfaction in data FAIRness is high. Most datasets have high FAIR scores. Data assets have moved to delivery platforms, and investments in delivery platform services allow for maintaining high FAIR scores.