How to build an agile data pipeline

Agility and data are two of the most overused buzzwords of the business community – and for good reason.

Every business wants to be agile, to be responsive to the changing environment, to survive and thrive. Likewise, forward-thinking businesses are heavily focused on data as a route to greater insights, creativity and efficiency. Putting the two concepts together may look like buzzwords squared, but an agile data pipeline is not a technology to hype. It is a smarter way of working with what enterprises already have, or with skills they can readily acquire.

An agile data pipeline is what data-centric organisations are putting in place to make the best use of their data investments and to ensure that the business can incorporate data-led analytical decision-making in a healthy and sustainable way.

As with any business process, building an agile pipeline involves several stages and should encompass a range of appropriate stakeholders within the business. In practice that’s not always the case, as many organisations develop their analytics functions in a higgledy-piggledy manner.

It’s no surprise that the data estate of a business can quickly grow out of control – the four Vs of big data, as defined by IBM, are variety, velocity, volume and veracity, and they show that data is no monolithic thing. It’s a living, changing entity. So fluid, in fact, that in 2017 Experian built on this format and added two more Vs: vulnerability and value.

So how do you corral and harness the bucking bronco of data and put it behind the corporate plough, to turn up the nuggets of true insight?

A data catalogue makes storing, finding and using data a much more seamless experience. It’s an organised solution that allows business users to explore data sources and understand them. It saves users time and can stop them recreating data that already exists but that they would have failed to find in an uncatalogued estate. It’s a great resource to keep the analytical process ticking over at speed, without slowing down the work of data scientists or ‘line of business’ analysts.

A faultless data catalogue doesn’t arrive fully formed, and the history of data governance integrations is littered with solutions that have failed to achieve critical adoption in an organisation. To truly deliver on a data catalogue, the business must also focus on the people and the process, not just the technology. Analytic leaders must build a culture that enables users to succeed with data.

Discover together

Data discovery can be fun, but it’s a hygiene factor that the analyst needs to get through before they can do the job they want to do: analytics, insights, and adding value to the business. Really, the organisation wants to unite all of its data workers with the data and analytic assets they could possibly (but legitimately!) need, in a controlled and secure way. It’s important to take steps to make data both searchable and trackable. A platform will offer this and even data lineage, giving more visibility for better governance. When data discovery and data security are breathtakingly easy, there’s no room for data governance missteps. It’s a great first step before an enterprise can create a culture of collaboration, sharing, and innovation by extending formerly tribal knowledge across the organisation.
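To make "searchable and trackable" concrete, here is a minimal sketch of a catalogue that supports keyword search and lineage lookup. It is an illustration only, not how any particular platform works: the Asset and Catalogue classes, their fields and the sample asset names are all hypothetical, and the code assumes Python 3.9 or later.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    """One catalogue entry (hypothetical structure, for illustration only)."""
    name: str
    description: str
    tags: set[str] = field(default_factory=set)
    derived_from: list[str] = field(default_factory=list)  # direct upstream asset names


class Catalogue:
    """A toy in-memory catalogue: register assets, search them, trace lineage."""

    def __init__(self) -> None:
        self._assets: dict[str, Asset] = {}

    def register(self, asset: Asset) -> None:
        self._assets[asset.name] = asset

    def search(self, term: str) -> list[str]:
        """Return sorted names of assets whose name, description or tags mention the term."""
        needle = term.lower()
        hits = [
            a.name
            for a in self._assets.values()
            if needle in a.name.lower()
            or needle in a.description.lower()
            or any(needle in t.lower() for t in a.tags)
        ]
        return sorted(hits)

    def lineage(self, name: str) -> list[str]:
        """Return every upstream asset of `name`, nearest first, without duplicates."""
        seen: list[str] = []
        queue = list(self._assets[name].derived_from)
        while queue:
            current = queue.pop(0)
            if current in seen:
                continue
            seen.append(current)
            if current in self._assets:
                queue.extend(self._assets[current].derived_from)
        return seen


# Sample data (invented for the example).
catalogue = Catalogue()
catalogue.register(Asset("crm_export", "Raw customer records from the CRM", {"customer"}))
catalogue.register(
    Asset("clean_customers", "Deduplicated customer table", {"customer"}, ["crm_export"])
)
catalogue.register(
    Asset("churn_report", "Monthly churn dashboard", {"churn"}, ["clean_customers"])
)

print(catalogue.search("customer"))
print(catalogue.lineage("churn_report"))
```

An analyst searching for "customer" finds both the raw export and the cleaned table, and can see that the churn report depends on each of them. A real catalogue would add access control and persistent storage, which this sketch leaves out.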

Culture the data culture

The data catalogue is the starting point for most analytical activities. Searching and finding content, understanding context and gaining trust in the results through community feedback and interaction – it’s a great resource when it’s used correctly, saving time and energy, and greatly aiding productivity.

The success of the catalogue is tied into the success of the organisation. Track and reward the most active contributors who add value to the analytic process, understand the assets that are creating the most impactful results, and promote those users to ensure that information assets are well curated and maintained.
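As a rough illustration of tracking contributors and high-impact assets, the sketch below counts entries in an activity log. The log format, the user names and the two helper functions are hypothetical, and counting raw actions is only a crude stand-in for "adding value"; a real scheme would probably weight different kinds of contribution.

```python
from collections import Counter

# Hypothetical activity log: (user, action, asset) tuples.
events = [
    ("asha", "annotate", "clean_customers"),
    ("ben", "thumbs_up", "churn_report"),
    ("asha", "define_term", "churn_report"),
    ("carla", "share_link", "churn_report"),
    ("asha", "thumbs_up", "crm_export"),
    ("ben", "annotate", "churn_report"),
]


def top_contributors(log, n=3):
    """Rank users by how many catalogue actions they performed."""
    return Counter(user for user, _, _ in log).most_common(n)


def most_engaged_assets(log, n=3):
    """Rank assets by how many actions they attracted."""
    return Counter(asset for _, _, asset in log).most_common(n)


print(top_contributors(events, 2))
print(most_engaged_assets(events, 1))
```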

The right data culture is socially engaging. It empowers users to impart and share knowledge, and is backed by technology that supports the different ways users bring their experience together to solve problems. This includes creating and annotating definitions and discussing quality and purpose in conversation threads. Even simple social gestures like sharing a link or giving a 'thumbs up' reinforce the value of the underlying asset and make it richer and easier to find for future users.

Collaborate or die!

It might be that, in the days before the organisation focused on data, others had already collected the same information or performed a similar analysis, but analysts have no good way of finding it. Data assets and the resulting information proliferate, compounding the problem and creating inefficiencies and delays in answering critical business questions.

Taking a cue from social media and wiki techniques, social interactions can help users share and utilise organisational tribal knowledge easily. Everything in the analytic process should be shareable: data, analytic apps, workflows, macros, visualisations, and dashboards. When everything is seamlessly shareable, and it is quick to identify trusted information assets along with their lineage and how they are used, it becomes much simpler to make more impactful business decisions.

One of the most important pieces of this is closing the gap, not only in finding the right data but also between the roles within an organisation: IT, business analysts, data scientists, everyday ‘citizen data scientists’, and onwards to all who use data. Sharing across an organisation is the grease on the wheels of innovation.

Define the best working practices

From the moment you embark on an analytics project you stand at a base camp with the peak of expectations staring at you from across the chasm of ignorance. Building a social repository of all the organisation's data sources, reports, workflows, terminology, and more (potentially thousands of lifetimes of accumulated knowledge) is as daunting as climbing Mount Everest. So, don't.

Start small, but think big. Tackle smaller challenges to get some early victories and build momentum from there.

• Pick a single department or project. Perhaps start with a handful of critical datasets.

• Document expertise while reports and data sources are being created, before the skills and the knowledge leave the project (or the company!). Ensure that new people can understand the function of dashboards, reports and other datasets.

• Follow your business strategy: document and socialise the assets associated with key strategic projects, and use the catalogue as a means to change the culture towards greater collaboration.

• To ensure adoption, it’s vital that users always find the information up to date. Without timeliness, the catalogue immediately loses trust and credibility and the pipeline starts to leak.

• A business glossary is a critical component of your data strategy. A glossary can take many forms: definitions, concepts, subject areas, etc. It captures the unique language of your organisation in a central location, and then connects that meaning with the contents of the catalogue.

• A proper analytics pipeline lives or dies on whether users find value in the information within. No one central to the organisation, not even the BI and IT teams, has a 100 per cent understanding of all those data sources, data sets, reports and other types of assets. This expertise and 'know-how' is in the heads of staff: business teams, analysts, knowledge workers, analytics groups, and more. It's pervasive and waiting to be harnessed.

Trusting data

It’s one thing to have data; it’s another to trust it and use it properly. Famously, executives have relied on their experience, their ‘gut’, when making decisions, and sometimes that’s not necessarily a bad idea. Where data is not cleaned, rated and trusted, it might not be worth the time to review. But where the right steps are in place, the data can tell a very honest and trustworthy story. It is a better resource than the thoughts and opinions of an executive who may not have access to all the facts, the long-term trends, or the powerful analytical ability to correlate them appropriately.

So to stock the data pipeline, put in place some simple best practices, encourage your people with good processes and give them the technology that makes this all easy. We’re no longer in the days of needing to know how to code to operate analytical tools, and end-to-end platforms take the sting out of finding, moving, prepping and using data. In fact, stocking the analytics pipeline should be a breezy, exhilarating process, the opening stages in a virtuoso performance by a data maestro.

Nick Jewell, Director of Product Strategy, Alteryx
