Finding Other Sources of Dark Data
It may seem like dark data resides only in digital assets, but in most organizations it also resides in the documents and spreadsheets that knowledge workers generate daily for their own use. These artifacts draw data from a variety of sources across the organization, yet the information they hold is used only by the knowledge workers and their teams. These one-off spreadsheets, which may or may not roll up to leadership, are not usually connected to any internal systems. Reliance on them limits the performance of the enterprise and undermines the collection of knowledge, which exacerbates the challenges posed by dark data. In addition, these one-off documents are not systematically governed, so they can contain personal or team-centered bias that renders their value questionable.
We should always begin our work on a new project by asking what impact the end deliverable will have. When we’re working on a high-priority project, designing a solution for dark data is almost always going to be imperative. Typically, when large-scale deliverables are in the build process, the minimum viable product (MVP) of the solution will be simple. Subsequent releases will integrate more dark data, which will provide advanced capabilities and better insights. However, when developing the MVP we should always prepare for future iterations and build the foundation of the solution so that it can accommodate both the knowns and the unknowns.
When building data products that focus primarily on intelligence, approach the build expecting that requirements will change and that the product will grow over time as more is known. Building for scale is critical, and it is important that the analysis process doesn’t reduce project velocity. Balancing new data discovery against the use of information already on hand is critical to maintaining forward momentum throughout the project delivery process. The following diagram demonstrates various forms of intelligence that can be created by approaching problems in this manner.
This dashboard displays full-year performance data for a company’s total acquired customers and the efficacy of its acquisition channels. The view shows totals for all channels combined and for each individual channel. It demonstrates that acquisitions as a whole are performing almost exactly as planned and that the efficacy of all channels combined is right on target. However, diving into the individual channels and comparing each one’s acquisitions and efficacy against its targets makes it clear that there is room for improvement.
The total balances out because some channels are overperforming and others are underperforming. Ideally, we’d want to delve even deeper and understand why certain channels are underperforming. The acquisitions chart and the efficacy gap chart reveal that the referrals and in-store channels have decent acquisition numbers but are not as effective as they should be given the volume of traffic that each channel is getting. Looking more closely at each channel and location can bring to light where performance is lacking and allow corrective action to be taken.
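To make the point about balanced totals concrete, here is a minimal Python sketch of how an on-target total can hide per-channel gaps. Everything in it is an assumption made for illustration: the channel figures are invented, the function names are hypothetical, and efficacy is assumed to mean acquisitions divided by traffic, which the dashboard may define differently.

```python
# Hypothetical full-year figures; every number below is invented for illustration.
# name: (acquired, target_acquired, traffic, target_efficacy)
CHANNELS = {
    "online": (5200, 4500, 40000, 0.11),
    "referrals": (2400, 2600, 30000, 0.10),
    "in_store": (2400, 2900, 32000, 0.10),
}


def channel_report(channels):
    """Return per-channel acquisition and efficacy gaps (actual minus target)."""
    report = {}
    for name, (acquired, target, traffic, target_eff) in channels.items():
        # Assumption: efficacy is the share of channel traffic that converts.
        efficacy = acquired / traffic
        report[name] = {
            "acquisition_gap": acquired - target,
            "efficacy": efficacy,
            "efficacy_gap": efficacy - target_eff,
        }
    return report


def total_acquisition_gap(channels):
    """Sum the acquisition gaps across all channels."""
    return sum(acquired - target for acquired, target, _, _ in channels.values())


def underperformers(report):
    """List channels whose efficacy falls below target, in alphabetical order."""
    return sorted(name for name, row in report.items() if row["efficacy_gap"] < 0)


if __name__ == "__main__":
    report = channel_report(CHANNELS)
    print("Total acquisition gap:", total_acquisition_gap(CHANNELS))
    for name, row in report.items():
        print(name, row["acquisition_gap"], f"{row['efficacy_gap']:+.3f}")
    print("Below efficacy target:", underperformers(report))
```

With these invented figures the total acquisition gap is zero, yet two of the three channels fall short of their efficacy targets. A summary view alone would miss this, which is why the per-channel comparison is needed.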