The challenge
A private medical research university sought assistance in reviewing multiple healthcare applications to identify account access exceptions, that is, users not trained to handle health data. The analysis identified and confirmed user access for select data systems containing electronic protected health information (ePHI) records covered by HIPAA.
The evaluation required extensive data management and analysis across a number of disparate applications and systems: collecting, standardizing, integrating, and analyzing user application files, then running match-based queries against an approved list of users permitted to access ePHI within university-owned applications.
While the standard approach was effective, it was operationally inefficient, unresponsive to changing requirements, and didn’t provide visualized insights:
- A manual process of data collection, management, and analysis reduced the overall efficiency of account access reviews and confirmation. Relying on a manual Excel import for each user application file meant that every file had to be “cleaned” and standardized individually against common file parameters (e.g., Banner ID numbers, Active Directory identifiers) each time it was received from the client. Once standardized, each file was compared against the list of approved users for each application containing ePHI. This process was repeated for every user application file, and repeated again whenever the client provided a newer version of a file. The manual handling also increased the risk of data errors caused by version control mistakes.
- The use of multiple software systems made it difficult to generate a unique ID for each application user across all systems. Because each system retained different data elements for its users, generating a unique ID for comparison against the approved ePHI user list, and for identifying duplicate users across systems, was a burdensome process.
- With limited interconnectivity between the user application files, it was difficult to generate dynamic, insightful analyses that identified the root causes of ePHI access errors. The manual process of data collection, management, and analysis left limited capacity to design queries for deeper statistical analysis of specific applications and user roles, or to respond effectively to ad hoc analysis requests from the university.
The solution
Our cross-functional team of cybersecurity and data analytics experts designed an improved analytics framework to automate data collection, management, and analysis. Leveraging advanced analytical tools including Alteryx, SQL, and Tableau, we greatly improved the review’s efficiency, responsiveness, and ability to produce insightful analysis. The new analytics framework was divided into three phases:
1. Data importing, cleaning, and standardization
- Each application user file was imported into Alteryx.
- The data was cleaned and transformed to improve data quality.
- Based on the available data elements, a unique ID was generated for each user, and five standardized data elements were retained per user.
- All user application files were merged into a unified application user data file and exported to a secure SQL Server database (a simplified sketch of this phase follows below).
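For illustration, this phase could be approximated in Python with pandas, although the engagement itself used Alteryx workflows; the file names, column mappings, retained elements, and ID-derivation rule below are all hypothetical:

```python
import hashlib

import pandas as pd
from sqlalchemy import create_engine

# Hypothetical standardized schema: a derived unique ID plus retained elements.
STANDARD_COLUMNS = ["unique_id", "banner_id", "ad_identifier", "last_name", "application"]

def make_unique_id(row: pd.Series) -> str:
    """Derive a stable unique ID from the identifiers a system retains.
    The hash-of-concatenation rule here is illustrative, not the actual rule."""
    basis = f"{row.get('banner_id', '')}|{row.get('ad_identifier', '')}".lower().strip()
    return hashlib.sha256(basis.encode()).hexdigest()[:16]

def standardize(path: str, application: str, column_map: dict) -> pd.DataFrame:
    """Import one application's user export and map it to the common schema."""
    df = pd.read_excel(path).rename(columns=column_map)
    df["application"] = application
    df["unique_id"] = df.apply(make_unique_id, axis=1)
    return df.reindex(columns=STANDARD_COLUMNS)

# Each application export arrives with its own column names (examples made up).
files = [
    ("ehr_users.xlsx", "EHR", {"Banner #": "banner_id", "AD Login": "ad_identifier", "Surname": "last_name"}),
    ("lab_users.xlsx", "LabSystem", {"ID": "banner_id", "User Name": "ad_identifier", "Last": "last_name"}),
]

unified = pd.concat([standardize(p, app, cmap) for p, app, cmap in files], ignore_index=True)

# Load the unified user file into the secure SQL Server database.
engine = create_engine("mssql+pyodbc://@SECURE_SERVER/access_review?driver=ODBC+Driver+17+for+SQL+Server")
unified.to_sql("application_users", engine, if_exists="replace", index=False)
```

Deriving the unique ID as a hash of normalized identifiers keeps the rule deterministic, so re-running the workflow on an updated file yields the same IDs and avoids the version-control errors of the manual process.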
2. Query-based matching method within secure SQL Server database
- The ePHI Permitted Users list was imported into the secure SQL Server database.
- Within this SQL environment, 25 separate queries were executed to determine whether each user was permitted to access ePHI data. The queries tested multiple variations of the standardized data elements to minimize false-positive results (one variant is sketched below).
- The final results were stored within the SQL Server database.
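As a simplified sketch of a single match variant, assuming hypothetical table and column names (application_users, ephi_permitted_users, match_results) in the same SQL Server database:

```python
from sqlalchemy import create_engine, text

engine = create_engine("mssql+pyodbc://@SECURE_SERVER/access_review?driver=ODBC+Driver+17+for+SQL+Server")

# One of many match variants: an exact join on the derived unique ID.
# The actual review executed 25 variants over different combinations of
# the standardized elements; this query and all names are illustrative.
match_by_unique_id = text("""
    INSERT INTO match_results (unique_id, application, is_permitted, match_rule)
    SELECT u.unique_id,
           u.application,
           CASE WHEN p.unique_id IS NULL THEN 0 ELSE 1 END,
           'unique_id_exact'
    FROM application_users AS u
    LEFT JOIN ephi_permitted_users AS p
           ON p.unique_id = u.unique_id
""")

with engine.begin() as conn:
    conn.execute(match_by_unique_id)
```

Tagging each variant’s results with a match_rule label makes it easy to see how many matches each rule contributes and to audit candidate false positives rule by rule.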
3. Generation of department-level reports and dynamic analysis
- The list of users identified as incorrectly having access to ePHI-related data systems was compiled and provided to each applicable university department for remediation.
- An executive dashboard was created using the user access data generated throughout the analysis process. A Sankey flow diagram provides IT leaders with a comprehensive view of which departments and applications exhibited higher rates of user access errors. Additionally, an application scorecard was created to provide drill-down capabilities for root-cause analysis (the aggregation feeding both views is sketched below).
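A hedged pandas sketch of the aggregation that could feed such a dashboard, again with hypothetical table and column names (match_results, user_departments):

```python
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("mssql+pyodbc://@SECURE_SERVER/access_review?driver=ODBC+Driver+17+for+SQL+Server")

# Pull stored match results joined to a (hypothetical) department mapping.
results = pd.read_sql(
    """
    SELECT d.department, r.application, r.is_permitted
    FROM match_results AS r
    JOIN user_departments AS d ON d.unique_id = r.unique_id
    """,
    engine,
)

# Exception rate per department/application pair: the input for the
# Sankey diagram and the drill-down scorecard.
scorecard = (
    results.assign(is_exception=lambda df: 1 - df["is_permitted"])
           .groupby(["department", "application"], as_index=False)
           .agg(users=("is_exception", "size"), exceptions=("is_exception", "sum"))
)
scorecard["exception_rate"] = scorecard["exceptions"] / scorecard["users"]

# Departments with exceptions receive a remediation extract.
for dept, rows in scorecard[scorecard["exceptions"] > 0].groupby("department"):
    rows.to_csv(f"remediation_{dept}.csv", index=False)
```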
Additionally, throughout the process, department leader feedback was integrated to improve the accuracy and timeliness of the analysis.
The benefit
This improved approach allows the university to spend its time on value-added risk remediation activities rather than manual data processing. The shift from tactical tasks to strategic advising results in a sustainable, data-driven solution: the university can now better understand the root causes of user access errors and remediate risks faster.