Modern big data processing technologies. What is Big Data: the most important things to know about big data, plus myths and misconceptions around it

What is Big Data (literally, "big data")? Let's look first at the Oxford Dictionary:

Data: quantities, characters, or symbols on which a computer performs operations and which can be stored and transmitted as electrical signals or recorded on magnetic, optical, or mechanical media.

The term Big Data describes data sets that grow exponentially over time. Processing such volumes of data is impossible without specialized tools.

The benefits that Big Data provides:

  1. Collecting data from various sources.
  2. Improving business processes through real-time analytics.
  3. Storing huge amounts of data.
  4. Insights: Big Data helps uncover hidden information in structured and semi-structured data.
  5. Reducing risk and making smart decisions through appropriate risk analytics.

Big Data Examples

The New York Stock Exchange generates about 1 terabyte of trading data per session, every day.

Social media: statistics show that Facebook's databases receive about 500 terabytes of new data every day, generated mainly by photo and video uploads to the social network's servers, messaging, comments on posts, and so on.

A jet engine generates about 10 terabytes of data every 30 minutes of flight. With thousands of flights every day, the volume of data reaches petabytes.

Big Data classification

Big data forms:

  • Structured
  • Unstructured
  • Semi-structured

Structured form

Data that can be stored, accessed, and processed in a fixed format is called structured. Over a long period, computer science has made great progress in techniques for working with this type of data (where the format is known in advance) and has learned how to extract value from it. However, problems are already emerging as volumes grow toward sizes measured in zettabytes.

1 zettabyte equals a billion terabytes

These numbers make it easy to see why the term Big Data is justified, and what difficulties the storage and processing of such data entail.

Data stored in a relational database is structured: for example, a table of company employees.

Unstructured form

Data of unknown structure is classified as unstructured. In addition to its large size, this form presents a number of difficulties in processing and in extracting useful information. A typical example of unstructured data is a heterogeneous source containing a mix of plain text files, pictures, and videos. Today, organizations have access to large amounts of raw, unstructured data but do not know how to extract value from it.

Semi-structured form

This category combines elements of both of the above: semi-structured data has some organization, but it is not defined by tables in a relational database. An example of this category is personal data presented in an XML file:

  <rec><name>Prashant Rao</name><sex>Male</sex><age>35</age></rec>
  <rec><name>Seema R.</name><sex>Female</sex><age>41</age></rec>
  <rec><name>Satish Mane</name><sex>Male</sex><age>29</age></rec>
  <rec><name>Subrato Roy</name><sex>Male</sex><age>26</age></rec>
  <rec><name>Jeremiah J.</name><sex>Male</sex><age>35</age></rec>

Characteristics of Big Data

Big Data growth over time: in the chart, blue represents structured enterprise data, which is stored in relational databases; the other colors represent unstructured data from various sources (IP telephony, devices and sensors, social networks, and web applications).

According to Gartner, big data varies in volume, rate of generation, variety, and variability. Let's take a closer look at these characteristics.

  1. Volume. The term Big Data itself implies large size. Data volume is a critical metric in determining the potential value that can be extracted. Every day, 6 million people use digital media, generating an estimated 2.5 quintillion bytes of data. Volume is therefore the first characteristic to consider.
  2. Variety is the next aspect. It refers to heterogeneous sources and the nature of the data, which can be either structured or unstructured. Previously, spreadsheets and databases were the only sources of information considered in most applications. Today, data in the form of emails, photos, videos, PDF files, and audio is also considered in analytical applications. This variety of unstructured data creates problems in storage, mining, and analysis: 27% of companies are not confident that they are working with the right data.
  3. Velocity. How quickly data is accumulated and processed to meet requirements determines its potential. Velocity is the rate of information flow from sources: business processes, application logs, social networks and media sites, sensors, mobile devices. The flow of data is huge and continuous over time.
  4. Variability describes how data changes at different points in time, which complicates processing and management. Most data, for example, is unstructured in nature.

Big Data analytics: what are the benefits of big data

Promotion of goods and services: Access to data from search engines and sites like Facebook and Twitter allows businesses to more accurately develop marketing strategies.

Improving service for customers: Traditional customer feedback systems are being replaced by new ones that use Big Data and Natural Language Processing to read and evaluate customer feedback.

Risk calculation associated with the release of a new product or service.

Operational efficiency: big data is structured so that the necessary information can be extracted quickly and accurate results produced promptly. This combination of Big Data and storage technologies helps organizations optimize their work with rarely used information.

Big Data is a set of methods for working with huge volumes of structured or unstructured information. Big data specialists process and analyze it to obtain visual, human-readable results. Look At Me talked to professionals and found out how things stand with big data processing in Russia, and where those who want to work in this field are best advised to study.

Alexey Ryvkin on the main trends in the field of big data, communication with customers, and the world of numbers

I studied at the Moscow Institute of Electronic Technology. The main thing I managed to take away from there was fundamental knowledge in physics and mathematics. Simultaneously with my studies, I worked at the R&D center, where I was involved in the development and implementation of noise-resistant coding algorithms for secure data transmission. After finishing my bachelor's degree, I entered the master's program in business informatics at the Higher School of Economics. After that I wanted to work at IBS. I was lucky that at that time, due to a large number of projects, there was an additional recruitment of interns, and after several interviews I began working at IBS, one of the largest Russian companies in this field. In three years, I went from an intern to an enterprise solutions architect. Currently I am developing expertise in Big Data technologies for customer companies from the financial and telecommunications sectors.

There are two main specializations for people who want to work with big data: analysts, and IT consultants who create technologies for working with big data. Beyond that, we can also talk about the profession of Big Data Analyst: people who work directly with data and with the customer's IT platform. Previously, these were ordinary mathematical analysts who knew statistics and mathematics and used statistical software to solve data analysis problems. Today, in addition to statistics and mathematics, an understanding of technology and of the data life cycle is also necessary. This, in my opinion, is what distinguishes modern Data Analysts from the analysts who came before.

My specialization is IT consulting, that is, I come up with and offer clients ways to solve business problems using IT technologies. People with different experiences come to consulting, but the most important qualities for this profession are the ability to understand the needs of the client, the desire to help people and organizations, good communication and team skills (since it is always working with the client and in a team), good analytical skills. Internal motivation is very important: we work in a competitive environment, and the customer expects unusual solutions and interest in work.

Most of my time is spent communicating with customers, formalizing their business needs, and helping them develop the most suitable technology architecture. The selection criteria here have their own peculiarity: in addition to functionality and TCO (total cost of ownership), non-functional requirements for the system are very important, most often response time and information processing time. To convince the customer, we often use a proof-of-concept approach: we offer to "test" the technology for free on some task, on a narrow data set, to make sure the technology works. The solution should create a competitive advantage for the customer through additional benefits (for example, cross-selling) or solve some business problem, say, reducing a high level of loan fraud.

It would be much easier if clients came with a ready-made task, but so far they do not understand that a revolutionary technology has appeared that can change the market in a couple of years

What problems do you face? The market is not yet ready to use big data technologies. It would be much easier if clients came with a ready-made task, but so far they do not understand that a revolutionary technology has appeared that can change the market in a couple of years. This is why we essentially work in startup mode - we don’t just sell technologies, but every time we convince clients that they need to invest in these solutions. This is the position of visionaries - we show customers how they can change their business using data and IT. We are creating this new market - the market for commercial IT consulting in the field of Big Data.

If a person wants to go into data analysis or IT consulting in the field of Big Data, the first thing that matters is a mathematical or technical education with good mathematical training. It is also useful to master specific technologies, for example SAS, Hadoop, the R language, or IBM solutions. In addition, you need to take an active interest in applied problems for Big Data, for example, how it can be used for improved credit scoring in a bank or for customer lifecycle management. This and other knowledge can be obtained from available sources: for example, Coursera and Big Data University. There is also the Customer Analytics Initiative at Wharton (University of Pennsylvania), where a lot of interesting material has been published.

A major problem for those who want to work in our field is the clear lack of information about Big Data. You cannot go to a bookstore or some website and get, for example, a comprehensive collection of cases on all applications of Big Data technologies in banks. There are no such directories. Some of the information is in books, some is collected at conferences, and some you have to figure out on your own.

Another problem is that analysts are comfortable in the world of numbers but not always comfortable in business. Such people are often introverts who find communication difficult, so it is hard for them to convey research findings convincingly to clients. To develop these skills, I would recommend books such as The Pyramid Principle and Say It with Charts. They help develop presentation skills and teach you to express your thoughts concisely and clearly.

Participating in various case championships while studying at the National Research University Higher School of Economics helped me a lot. Case championships are intellectual competitions for students where they need to study business problems and propose solutions to them. There are two types: case championships of consulting firms, for example, McKinsey, BCG, Accenture, as well as independent case championships such as Changellenge. While participating in them, I learned to see and solve complex problems - from identifying a problem and structuring it to defending recommendations for its solution.

Oleg Mikhalsky on the Russian market and the specifics of creating a new product in the field of big data

Before joining Acronis, I was already involved in launching new products to market at other companies. It’s always interesting and challenging at the same time, so I was immediately interested in the opportunity to work on cloud services and data storage solutions. All my previous experience in the IT industry, including my own startup project I-accelerator, came in handy in this area. Having a business education (MBA) in addition to a basic engineering degree also helped.

In Russia, large companies, such as banks and mobile operators, have a need for big data analysis, so in our country there are prospects for those who want to work in this area. True, many current projects are integration projects, that is, built on foreign developments or open-source technologies. Such projects do not create fundamentally new approaches and technologies; rather, they adapt existing ones. At Acronis, we took a different route: after analyzing the available alternatives, we decided to invest in our own development, resulting in a secure big data storage system that is comparable in cost to, for example, Amazon S3, but works reliably and efficiently at significantly smaller scales as well. Large Internet companies also have their own big data developments, but these are focused more on internal needs than on the needs of external clients.

It is important to understand the trends and economic forces that influence the field of big data. To do this, you need to read a lot, listen to speeches by authoritative experts in the IT industry, and attend thematic conferences. Now almost every conference has a section on Big Data, but they all talk about it from a different angle: from a technology, business or marketing point of view. You can go to project work or an internship at a company that is already leading projects on this topic. If you are confident in your abilities, then it is not too late to organize a startup in the field of Big Data.

Without constant contact with the market, a new development risks going unclaimed

It's true that when you are in charge of a new product, a lot of time is spent on market analytics and on communication with potential clients, partners, and professional analysts who know a lot about clients and their needs. Without constant contact with the market, a new development risks going unclaimed. There are always many uncertainties: you have to figure out who the early adopters will be, what you have to offer them, and how to then attract a mass audience. The second most important task is to formulate and convey to developers a clear and holistic vision of the final product, to motivate them to work in conditions where some requirements may still change and priorities depend on feedback from the first customers. An important task, therefore, is managing the expectations of clients on the one hand and developers on the other, so that neither loses interest and the project is carried through to completion. After the first successful project it becomes easier, and the main challenge is to find the right growth model for the new business.

It was predicted that the total global volume of data created and replicated in 2011 would be about 1.8 zettabytes (1.8 trillion gigabytes), roughly 9 times more than was created in 2006.

More complex definition

However, big data involves more than just analyzing huge amounts of information. The problem is not that organizations create huge amounts of data, but that most of it comes in formats that fit poorly with the traditional structured database format: web logs, videos, text documents, machine code, or, for example, geospatial data. All of this is stored in many different repositories, sometimes even outside the organization. As a result, corporations may have access to a huge amount of their data yet lack the tools needed to establish relationships within this data and draw meaningful conclusions from it. Add the fact that data is now updated more and more frequently, and you get a situation in which traditional methods of information analysis cannot keep up with huge volumes of constantly updated data, which ultimately opens the way for big data technologies.

Best definition

In essence, the concept of big data involves working with information of huge volume and diverse composition, which is frequently updated and located in different sources, in order to increase operational efficiency, create new products, and increase competitiveness. The consulting firm Forrester puts it briefly: "Big Data brings together techniques and technologies that extract meaning from data at the extreme limits of practicality."

How big is the difference between business analytics and big data?

Craig Bathy, executive director of marketing and chief technology officer of Fujitsu Australia, pointed out that business analysis is a descriptive process of analyzing the results a business achieved over a certain period, while the processing speed of big data makes analysis predictive, capable of offering business recommendations for the future. Big data technologies also make it possible to analyze more types of data than business intelligence tools do, allowing the focus to extend beyond structured repositories.

Matt Slocum of O'Reilly Radar believes that although big data and business analytics have the same goal (finding answers to a question), they differ from each other in three aspects.

  • Big data is designed to handle larger volumes of information than business analytics, and this certainly fits the traditional definition of big data.
  • Big data is designed to handle fast-arriving, fast-changing information, which means deep exploration and interactivity. In some cases, results are generated faster than a web page loads.
  • Big data is designed to process unstructured data whose uses we are only beginning to explore now that we are able to collect and store it, and we need algorithms and interactive query capabilities to make it easier to find the trends contained within these data sets.

According to the white paper "Oracle Information Architecture: An Architect's Guide to Big Data" published by Oracle, when working with big data, we approach information differently than when conducting business analysis.

Working with big data is not like the usual business intelligence process, where simply adding up known values produces a result: for example, adding up paid invoices gives the year's sales. When working with big data, the result is obtained by successive refinement through modeling: a hypothesis is put forward; a statistical, visual, or semantic model is built; the hypothesis is checked against it; and then the next hypothesis is put forward. This process requires the researcher either to interpret visual meanings, or to construct interactive queries based on knowledge, or to develop adaptive machine learning algorithms capable of producing the desired result. Moreover, the lifetime of such an algorithm can be quite short.

Big data analysis techniques

There are many different methods for analyzing data sets, based on tools borrowed from statistics and computer science (for example, machine learning). The list below does not pretend to be complete, but it reflects the most popular approaches across industries. Researchers continue to create new techniques and improve existing ones. In addition, some of the techniques listed do not apply exclusively to big data and can be used successfully on smaller arrays (for example, A/B testing and regression analysis). Of course, the more voluminous and diversified the array being analyzed, the more accurate and relevant the resulting insights.

A/B testing. A technique in which a control sample is compared in turn with others. It makes it possible to identify the optimal combination of indicators to achieve, for example, the best consumer response to a marketing offer. Big Data allows a great number of iterations and thus a statistically reliable result.
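
As a toy illustration of the idea, comparing a control sample with a variant can be reduced to a two-proportion z-test; the function name and the conversion numbers below are illustrative, not from the article:

```python
import math

def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of control (A) and variant (B).

    Returns the z statistic and a two-sided p-value.
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)          # pooled rate under H0
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # two-sided p-value from the standard normal CDF
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# 2.0% vs 2.6% conversion on 10,000 visitors per arm
z, p = two_proportion_ztest(200, 10_000, 260, 10_000)
print(round(z, 2), p < 0.05)   # → 2.83 True
```

With big data, the same test can be repeated across many segments and iterations, which is exactly where the statistical reliability mentioned above comes from.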

Association rule learning. A set of techniques for identifying relationships, i.e. association rules, between variables in large data sets. Used in data mining.
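
A minimal sketch of the support/confidence idea behind association rules, here over market baskets (the items, thresholds, and function name are all illustrative):

```python
from itertools import combinations
from collections import Counter

def association_rules(transactions, min_support=0.3, min_confidence=0.6):
    """Naive search for pairwise rules X -> Y with given support/confidence."""
    n = len(transactions)
    item_counts = Counter(i for t in transactions for i in set(t))
    pair_counts = Counter(
        pair for t in transactions for pair in combinations(sorted(set(t)), 2)
    )
    rules = []
    for (a, b), cnt in pair_counts.items():
        support = cnt / n                      # fraction of baskets with both items
        if support < min_support:
            continue
        for x, y in ((a, b), (b, a)):
            confidence = cnt / item_counts[x]  # P(y in basket | x in basket)
            if confidence >= min_confidence:
                rules.append((x, y, round(support, 2), round(confidence, 2)))
    return rules

baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk", "butter"},
]
print(association_rules(baskets))
```

Real data mining systems use more scalable algorithms (such as Apriori or FP-Growth), but the support and confidence measures are the same.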

Classification. A set of techniques that allows you to predict consumer behavior in a certain market segment (purchase decisions, churn, consumption volume, etc.). Used in data mining.
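
A minimal sketch of the classification idea, here as a 1-nearest-neighbor classifier predicting churn (the features, labels, and numbers are made up for illustration):

```python
def nearest_neighbor(train, query):
    """1-NN: label a new customer by the most similar known customer."""
    def dist(a, b):
        # squared Euclidean distance between feature vectors
        return sum((x - y) ** 2 for x, y in zip(a, b))
    features, label = min(train, key=lambda rec: dist(rec[0], query))
    return label

# (monthly spend, support calls) -> did the customer churn?
history = [
    ((90, 0), "stays"), ((80, 1), "stays"),
    ((20, 5), "churns"), ((15, 7), "churns"),
]
print(nearest_neighbor(history, (25, 6)))   # → churns
```

Production systems use richer models (decision trees, logistic regression, neural networks), but the task is the same: predict a category from known examples.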

Cluster analysis. A statistical method for classifying objects into groups by identifying common features that are not known in advance. Used in data mining.
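
The grouping idea can be sketched with a tiny k-means implementation; the data (customer age, monthly spend) and all names here are illustrative:

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Minimal k-means: assign points to nearest centroid, then re-average."""
    rnd = random.Random(seed)
    centroids = rnd.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # index of the nearest centroid (squared Euclidean distance)
            j = min(range(k),
                    key=lambda i: (p[0] - centroids[i][0]) ** 2
                                + (p[1] - centroids[i][1]) ** 2)
            clusters[j].append(p)
        centroids = [
            (sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c))
            if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids

# two obvious groups of customers: (age, monthly spend)
data = [(25, 200), (27, 220), (24, 180), (55, 900), (60, 950), (58, 880)]
print(sorted(kmeans(data, 2)))
```

The key point matching the definition above: the two groups were never labeled in advance; the algorithm discovers them from the features alone.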

Crowdsourcing. Methodology for collecting data from a large number of sources.

Data fusion and data integration. A set of techniques that allows you to analyze comments from social network users and compare them with sales results in real time.

Data mining. A set of techniques that allows you to determine the categories of consumers most susceptible to the promoted product or service, identify the characteristics of the most successful employees, and predict the behavioral model of consumers.

Ensemble learning. This method uses many predictive models, thereby improving the quality of the forecasts made.

Genetic algorithms. In this technique, possible solutions are represented in the form of "chromosomes", which can be combined and mutated. As in natural evolution, the fittest individual survives.
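
A toy genetic algorithm illustrating the definition, assuming bitstring chromosomes, one-point crossover, bit-flip mutation, and tournament selection (all parameters and names are illustrative):

```python
import random

def genetic_max(fitness, bits=16, pop_size=30, generations=60, seed=1):
    """Toy GA: evolve bitstring 'chromosomes' toward maximal fitness."""
    rnd = random.Random(seed)

    def decode(ch):                      # map a bitstring to x in [0, 1]
        return int(ch, 2) / (2 ** bits - 1)

    pop = ["".join(rnd.choice("01") for _ in range(bits)) for _ in range(pop_size)]
    for _ in range(generations):
        def pick():                      # tournament of two: the fitter survives
            a, b = rnd.sample(pop, 2)
            return a if fitness(decode(a)) >= fitness(decode(b)) else b
        nxt = []
        while len(nxt) < pop_size:
            p1, p2 = pick(), pick()
            cut = rnd.randrange(1, bits)             # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = "".join(                         # rare bit-flip mutation
                bit if rnd.random() > 0.02 else "10"[int(bit)] for bit in child
            )
            nxt.append(child)
        pop = nxt
    return max((decode(c) for c in pop), key=fitness)

# maximize a simple hump with its peak at x = 0.7
best = genetic_max(lambda x: -(x - 0.7) ** 2)
print(best)
```

Selection plays the role of "survival of the fittest": over generations, the population concentrates near the optimum.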

Machine learning. A direction in computer science (historically it has been given the name “artificial intelligence”), which pursues the goal of creating self-learning algorithms based on the analysis of empirical data.

Natural language processing (NLP). A set of techniques for recognizing natural human language borrowed from computer science and linguistics.

Network analysis. A set of techniques for analyzing connections between nodes in networks. In relation to social networks, it allows you to analyze the relationships between individual users, companies, communities, etc.

Optimization. A set of numerical methods for redesigning complex systems and processes to improve one or more metrics. Helps in making strategic decisions, for example, the composition of the product line to be launched on the market, conducting investment analysis, etc.

Pattern recognition. A set of techniques with self-learning elements for predicting the behavioral model of consumers.

Predictive modeling. A set of techniques for creating a mathematical model of a probable scenario for the development of events. An example: analyzing a CRM system database for conditions that might prompt subscribers to switch to another provider.

Regression. A set of statistical methods for identifying a pattern between changes in a dependent variable and one or more independent variables. Often used for forecasting and predictions. Used in data mining.
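
A minimal ordinary-least-squares sketch for one independent variable (the ad-spend and sales numbers are invented for illustration):

```python
def linreg(xs, ys):
    """Ordinary least squares for y = a*x + b (one independent variable)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # slope: covariance of x and y divided by variance of x
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    b = my - a * mx                      # intercept through the means
    return a, b

# ad spend (k$) vs. sales (k$): recover the underlying linear trend
a, b = linreg([1, 2, 3, 4, 5], [2.1, 3.9, 6.0, 8.1, 9.9])
print(round(a, 2), round(b, 2))   # → 1.98 0.06
```

The fitted slope and intercept can then be used for the forecasting mentioned in the definition: predicted sales for a new spend level x are simply a*x + b.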

Sentiment analysis. Techniques for assessing consumer sentiment based on natural language recognition technologies. They make it possible to isolate, from the general information flow, messages related to a subject of interest (for example, a consumer product), and then to evaluate the polarity of the judgment (positive or negative), its degree of emotionality, and so on.
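
A crude lexicon-based polarity scorer illustrates the simplest form of this: the word lists below are illustrative, and real systems use full NLP models rather than keyword counts:

```python
POSITIVE = {"great", "love", "excellent", "fast", "recommend"}
NEGATIVE = {"bad", "slow", "broken", "terrible", "refund"}

def polarity(text):
    """Crude lexicon score: +1 per positive word, -1 per negative word."""
    words = text.lower().replace(",", " ").replace(".", " ").split()
    return sum((w in POSITIVE) - (w in NEGATIVE) for w in words)

reviews = [
    "Great phone, fast delivery, I recommend it",
    "Terrible battery, slow and broken after a week",
]
print([polarity(r) for r in reviews])   # → [3, -3]
```

A positive score marks a positive judgment, a negative score a negative one; degree of emotionality could be approximated by the magnitude of the score.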

Signal processing. A set of techniques borrowed from radio engineering that aims to recognize a signal against a background of noise and its further analysis.

Spatial analysis. A set of methods, partly borrowed from statistics, for analyzing spatial data: terrain topology, geographic coordinates, object geometry. Geographic information systems (GIS) often serve as the source of such big data.

Statistics. The science of collecting, organizing, and interpreting data, including developing questionnaires and conducting experiments. Statistical methods are often used to make value judgments about the relationships between certain events.

Supervised learning. A set of techniques based on machine learning technologies that allow you to identify functional relationships in the analyzed data sets.

Simulation. Modeling the behavior of complex systems, often used for forecasting, prediction, and working through various scenarios in planning.

Time series analysis. A set of methods, borrowed from statistics and digital signal processing, for analyzing data sequences repeated over time. Some obvious applications are tracking the stock market or patient illnesses.
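
One of the most basic time-series methods is a trailing moving average, which smooths out short-term noise so the trend is visible (the prices below are illustrative):

```python
def moving_average(series, window):
    """Smooth a time series with a simple trailing moving average."""
    out = []
    for i in range(window - 1, len(series)):
        # average of the current value and the (window - 1) values before it
        out.append(sum(series[i - window + 1 : i + 1]) / window)
    return out

# daily closing prices: the 3-day average damps day-to-day fluctuations
prices = [10, 12, 11, 13, 15, 14, 16]
print(moving_average(prices, 3))   # → [11.0, 12.0, 13.0, 14.0, 15.0]
```

More advanced models (ARIMA, exponential smoothing) build on the same idea of separating trend from noise.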

Unsupervised learning. A set of techniques based on machine learning technologies that allow you to identify hidden functional relationships in the analyzed data sets. Has common features with Cluster Analysis.

Visualization. Methods for graphically presenting the results of big data analysis in the form of charts or animated images to simplify interpretation and make the results easier to understand.


Visual representation of the results of big data analysis is of fundamental importance for their interpretation. It is no secret that human perception is limited, and scientists continue to research ways of improving modern methods for presenting data as images, charts, or animations.

Analytical tools

As of 2011, some of the approaches listed in the previous subsection, or some combination of them, make it possible to implement practical analytical engines for working with big data. Among the free or relatively inexpensive open-source Big Data analysis systems are:

  • Revolution Analytics (based on the R language for mathematical statistics);
  • Apache Hadoop (an open-source distributed processing framework).

Of particular interest on this list is Apache Hadoop, open-source software that over the past five years has proven itself as a data analysis tool. As soon as Yahoo opened the Hadoop code to the open-source community, a whole movement to create Hadoop-based products appeared in the IT industry. Almost all modern big data analysis tools provide Hadoop integration; their developers range from startups to well-known global companies.

Markets for Big Data Management Solutions

Big Data Platforms (BDP) as a means of combating digital hoarding

The ability to analyze big data is commonly perceived as an unqualified benefit. But is this really so? What could the rampant accumulation of data lead to? Most likely, to what psychologists, with reference to humans, call pathological hoarding, or syllogomania: figuratively, "Plyushkin syndrome." In English, the compulsive urge to collect everything is called hoarding (from "hoard", a stockpile). According to classifications of mental illness, hoarding is considered a mental disorder. In the digital era, digital hoarding has joined traditional material hoarding; it can affect both individuals and entire enterprises and organizations.

World and Russian market

Big Data landscape: the main suppliers

Almost all leading IT companies have shown interest in tools for collecting, processing, managing, and analyzing big data, which is quite natural. Firstly, they encounter this phenomenon directly in their own business; secondly, big data opens up excellent opportunities for developing new market niches and attracting new customers.

Many startups have appeared on the market that make business by processing huge amounts of data. Some of them use ready-made cloud infrastructure provided by large players like Amazon.

Theory and practice of Big Data in industries

History of development

2017

TmaxSoft forecast: the next “wave” of Big Data will require modernization of the DBMS

Businesses know that the vast amounts of data they accumulate contain important information about their business and customers. A company that can successfully apply this information gains a significant advantage over its competitors and can offer better products and services than they do. However, many organizations still fail to use big data effectively because their legacy IT infrastructure cannot provide the necessary storage capacity, data exchange processes, utilities, and applications required to process and analyze large amounts of unstructured data and extract valuable information from it, TmaxSoft indicated.

Additionally, the increased processing power needed to analyze ever-increasing volumes of data may require significant investment in an organization's legacy IT infrastructure, as well as additional maintenance resources that could be used to develop new applications and services.

On February 5, 2015, the White House released a report discussing how companies use big data to charge different prices to different customers, a practice known as "price discrimination" or "personalized pricing". The report describes the benefits of big data for both sellers and buyers, and its authors conclude that many of the problematic issues arising from big data and differential pricing can be addressed through existing anti-discrimination and consumer protection laws.

The report notes that at this time, there is only anecdotal evidence of how companies are using big data in the context of personalized marketing and differentiated pricing. This information shows that sellers use pricing methods that can be divided into three categories:

  • study of the demand curve;
  • steering and differentiated pricing based on demographic data;
  • targeted behavioral marketing (behavioral targeting) and individualized pricing.

Studying the Demand Curve: To determine demand and study consumer behavior, marketers often conduct experiments in this area in which customers are randomly assigned to one of two possible price categories. “Technically, these experiments are a form of differential pricing because they result in different prices for customers, even if they are “non-discriminatory” in the sense that all customers have the same probability of being “sent” to a higher price.”

Steering is the practice of presenting products to consumers based on their membership in a particular demographic group. For example, a computer company's website may offer the same laptop to different types of buyers at different prices, based on the information they report about themselves (for example, whether the user represents a government agency, a scientific or commercial institution, or is an individual) or on their geographic location (for example, as determined by the computer's IP address).

Targeted behavioral marketing and customized pricing: in these cases, customers' personal information is used to target advertising and customize pricing for certain products. For example, online advertisers use data collected by advertising networks and through third-party cookies about users' online activity to target their advertisements. On the one hand, this approach lets consumers receive advertising for goods and services of interest to them. It may, however, worry consumers who do not want certain types of their personal data (such as information about visits to websites related to medical or financial matters) collected without their consent.

Although targeted behavioral marketing is widespread, there is relatively little evidence of personalized pricing in the online environment. The report speculates that this may be because the methods are still being developed, or because companies are hesitant to use custom pricing (or prefer to keep quiet about it) - perhaps fearing a backlash from consumers.

The report's authors suggest that "for the individual consumer, the use of big data clearly presents both potential rewards and risks." While acknowledging that big data raises transparency and discrimination issues, the report argues that existing anti-discrimination and consumer protection laws are sufficient to address them. However, the report also highlights the need for "ongoing monitoring" where companies use confidential information in a non-transparent manner or in ways that are not covered by the existing regulatory framework.

This report continues the White House's efforts to examine the use of big data and discriminatory pricing on the Internet and the resulting consequences for American consumers. It was previously reported that the White House Big Data Working Group published its report on this issue in May 2014. The Federal Trade Commission (FTC) also addressed these issues during its September 2014 workshop on big data discrimination.

2014

Gartner dispels myths about Big Data

A fall 2014 research note from Gartner lists a number of common Big Data myths among IT leaders and provides rebuttals to them.

  • Everyone is implementing Big Data processing systems faster than us

Interest in Big Data technologies is at an all-time high: 73% of organizations surveyed by Gartner analysts this year are already investing in them or planning to do so. But most of these initiatives are still at a very early stage, and only 13% of respondents have already implemented such solutions. The hardest part is determining how to extract revenue from Big Data and deciding where to start. Many organizations get stuck at the pilot stage because they cannot tie the new technology to specific business processes.

  • We have so much data that there is no need to worry about small errors in it

Some IT managers believe that small data flaws do not affect the overall results of analyzing huge volumes. When there is a lot of data, each individual error does indeed have less impact on the result, analysts note, but the errors themselves also become more numerous. In addition, most of the analyzed data is external, of unknown structure or origin, so the likelihood of errors grows. So in the world of Big Data, data quality is actually far more important.
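The point about relative versus absolute error counts can be illustrated with a toy simulation (purely illustrative: the per-record error rate is an invented constant):

```python
import random

random.seed(42)

ERROR_RATE = 0.001  # assumed constant per-record error probability (illustrative)

def error_stats(n_records):
    """Return (absolute error count, relative error share) for synthetic data."""
    errors = sum(1 for _ in range(n_records) if random.random() < ERROR_RATE)
    return errors, errors / n_records

small_abs, small_rel = error_stats(10_000)
big_abs, big_rel = error_stats(1_000_000)

# The relative share stays near 0.1% in both cases, but the absolute
# number of bad records grows roughly in proportion to the volume.
print(small_abs, big_abs)
```

Each individual error matters less in the larger dataset, yet the total number of bad records to detect and clean is a hundred times greater, which is exactly the quality burden Gartner describes.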

  • Big Data technologies will eliminate the need for data integration

Big Data promises the ability to process data in its original format, with automatic schema generation as it is read. It is believed that this will allow information from the same sources to be analyzed using multiple data models. Many believe this will also enable end users to interpret any data set as they see fit. In reality, most users often prefer the traditional approach with a ready-made schema, where the data is formatted appropriately and there are agreements on the integrity of the information and on how it should relate to the use case.

  • There is no point in using data warehouses for complex analytics

Many information management system administrators believe that there is no point in spending time creating a data warehouse, given that complex analytical systems rely on new types of data. In fact, many complex analytics systems use information from a data warehouse. In other cases, new types of data need to be additionally prepared for analysis in Big Data processing systems; decisions have to be made about the suitability of the data, the principles of aggregation and the required level of quality - such preparation may occur outside the warehouse.

  • Data warehouses will be replaced by data lakes

In reality, vendors mislead customers by positioning data lakes as a replacement for storage or as critical elements of the analytical infrastructure. Underlying data lake technologies lack the maturity and breadth of functionality found in warehouses. Therefore, managers responsible for data management should wait until lakes reach the same level of development, according to Gartner.

Accenture: 92% of those who implemented big data systems are satisfied with the results

Among the main advantages of big data, respondents named:

  • “searching for new sources of income” (56%),
  • “improving customer experience” (51%),
  • “new products and services” (50%) and
  • “an influx of new customers and maintaining the loyalty of old ones” (47%).

When introducing new technologies, many companies face traditional problems. For 51% the stumbling block was security, for 47% the budget, for 41% a lack of the necessary personnel, and for 35% difficulties in integrating with existing systems. Almost all companies surveyed (about 91%) plan to address the staff shortage soon by hiring big data specialists.

Companies are optimistic about the future of big data technologies. 89% believe they will change business as much as the Internet. 79% of respondents noted that companies that do not engage in big data will lose their competitive advantage.

However, respondents disagreed about what exactly should be considered big data. 65% of respondents believe that these are “large data files”, 60% believe that this is “advanced analytics and analysis”, and 50% believe that this is “data visualization tools”.

Madrid spends €14.7 million on big data management

In July 2014, it became known that Madrid would use big data technologies to manage city infrastructure. The cost of the project is 14.7 million euros, and the solutions to be implemented will be based on technologies for analyzing and managing big data. With their help, the city administration will manage its relationship with each service provider and pay accordingly, depending on the level of service delivered.

This concerns administration contractors who monitor the condition of streets, lighting, irrigation and green spaces, clean up public areas, and handle waste removal and recycling. During the project, 300 key performance indicators of city services were developed for specially appointed inspectors, on the basis of which 1.5 thousand different checks and measurements will be carried out daily. In addition, the city will begin using an innovative technology platform called Madrid iNTeligente (MiNT) - Smarter Madrid.

2013

Experts: Big Data is in fashion

Without exception, all vendors in the data management market are currently developing technologies for Big Data management. This new technological trend is also actively discussed by the professional community, both developers and industry analysts and potential consumers of such solutions.

As Datashift found out, as of January 2013 the wave of discussion around "big data" had exceeded all imaginable dimensions. After analyzing the number of mentions of Big Data on social networks, Datashift calculated that in 2012 the term was used about 2 million times in posts created by about 1 million different authors around the world. This is equivalent to 260 posts per hour, with a peak of 3,070 mentions per hour.

Gartner: Every second CIO is ready to spend money on Big data

After several years of experimentation with Big data technologies and the first implementations in 2013, the adaptation of such solutions will increase significantly, Gartner predicts. Researchers surveyed IT leaders around the world and found that 42% of respondents have already invested in Big data technologies or plan to make such investments within the next year (data as of March 2013).

Companies are forced to spend money on big data processing technologies because the information landscape is changing rapidly and demands new approaches to information processing. Many companies have already realized that large volumes of data are critical, and working with them yields benefits unavailable through traditional information sources and processing methods. In addition, constant discussion of "big data" in the media fuels interest in the relevant technologies.

Frank Buytendijk, a vice president at Gartner, even urged companies to tone down their efforts as some worry they are falling behind competitors in their adoption of Big Data.

“There is no need to worry; the possibilities for implementing ideas based on big data technologies are virtually endless,” he said.

Gartner predicts that by 2015, 20% of Global 1000 companies will have a strategic focus on “information infrastructure.”

In anticipation of the new opportunities that big data processing technologies will bring, many organizations are already organizing the process of collecting and storing various types of information.

For educational and government organizations, as well as industrial companies, the greatest potential for business transformation lies in combining accumulated data with so-called dark data, which includes email messages, multimedia and other similar content. According to Gartner, the winners in the data race will be those who learn to handle the widest variety of information sources.

Cisco survey: Big Data will help increase IT budgets

The Spring 2013 Cisco Connected World Technology Report, conducted in 18 countries by the independent research firm InsightExpress, surveyed 1,800 college students and an equal number of young professionals aged 18 to 30. The survey was conducted to gauge the readiness of IT departments to implement Big Data projects and to gain insight into the challenges involved, the technological shortcomings, and the strategic value of such projects.

Most companies collect, record and analyze data. However, the report says, many companies face a range of complex business and information technology challenges with Big Data. For example, 60 percent of respondents admit that Big Data solutions can improve decision-making processes and increase competitiveness, but only 28 percent said that they are already receiving real strategic benefits from the accumulated information.

More than half of the IT executives surveyed believe that Big Data projects will help increase IT budgets in their organizations, as there will be increased demands on technology, personnel and professional skills. 57 percent are confident that Big Data will increase their budgets over the next three years.

81 percent of respondents said that all (or at least some) Big Data projects will require the use of cloud computing. Thus, the spread of cloud technologies may affect the speed of adoption of Big Data solutions and the business value of these solutions.

Companies collect and use many different types of data, both structured and unstructured. Here are the sources from which survey participants receive their data (Cisco Connected World Technology Report):

Nearly half (48 percent) of IT leaders predict the load on their networks will double over the next two years. (This is especially true in China, where 68 percent of respondents share this view, and in Germany – 60 percent). 23 percent of respondents expect network load to triple over the next two years. At the same time, only 40 percent of respondents declared their readiness for explosive growth in network traffic volumes.

27 percent of respondents admitted that they need better IT policies and information security measures.

21 percent need more bandwidth.

Big Data opens up new opportunities for IT departments to add value and build strong relationships with business units, allowing them to increase revenue and strengthen the company's financial position. Big Data projects make IT departments a strategic partner to business departments.

According to 73 percent of respondents, the IT department will become the main driver of the implementation of the Big Data strategy. At the same time, respondents believe that other departments will also be involved in the implementation of this strategy. First of all, this concerns the departments of finance (named by 24 percent of respondents), research and development (20 percent), operations (20 percent), engineering (19 percent), as well as marketing (15 percent) and sales (14 percent).

Gartner: Millions of new jobs needed to manage big data

Global IT spending will reach $3.7 trillion by 2013, which is 3.8% more than spending on information technology in 2012 (the year-end forecast is $3.6 trillion). The big data segment will develop at a much faster pace, says a Gartner report.

By 2015, 4.4 million jobs in information technology will be created worldwide to service big data, of which 1.9 million will be in the United States. Moreover, each such job will entail the creation of three additional jobs outside the IT sector, so that in the United States alone, 6 million people will be working to support the information economy over the next four years.

According to Gartner experts, the main problem is that the industry lacks the talent for this: neither the private nor the public education system, in the USA for example, is able to supply the industry with enough qualified personnel. As a result, only one in three of the new IT jobs mentioned will be filled.

Analysts believe that the role of nurturing qualified IT personnel should be taken directly by companies that urgently need them, since such employees will be their ticket to the new information economy of the future.

2012

The first skepticism regarding "Big Data"

Analysts at Ovum and Gartner suggest that for big data, the fashionable topic of 2012, the time of disillusionment may be coming.

At this time, the term "Big Data" usually refers to the ever-growing volume of information flowing in operationally from social media, sensor networks and other sources, as well as to the growing range of tools used to process data and identify important business trends in it.

“Because of (or despite) the hype around the idea of ​​big data, manufacturers in 2012 looked at this trend with great hope,” said Tony Bayer, an analyst at Ovum.

Bayer reported that DataSift conducted a retrospective analysis of big data mentions in social media.

Moscow_Exchange May 6, 2015 at 8:38 pm

Analytical review of the Big Data market

  • Moscow Exchange company blog,
  • Big Data

"Big Data" is a topic actively discussed by technology companies. Some of them have become disillusioned with big data, while others, on the contrary, are making the most of it for business... A fresh analytical review of the domestic and global Big Data market, prepared by the Moscow Exchange together with IPOboard analysts, shows which trends are currently most relevant in the market. We hope the information will be interesting and useful.

WHAT IS BIG DATA?

Key Features
Big Data is currently one of the key drivers of information technology development. This field, relatively new to Russian business, has become widespread in Western countries. This is because in the era of information technology, especially after the boom in social networks, a significant amount of information began to accumulate about every Internet user, which ultimately gave rise to the Big Data field.

The term "Big Data" causes a lot of controversy; many believe that it means only the amount of accumulated information, but we should not forget the technical side: the field also covers storage technologies, computing, and services.

It should be noted that the field involves processing large amounts of information that are difficult to handle by traditional methods.

Below is a comparison table between traditional and Big Data databases.

The field of Big Data is characterized by the following features:
Volume – the accumulated database represents a large amount of information that is labor-intensive to process and store by traditional means; it requires new approaches and improved tools.
Velocity – this attribute refers both to the ever-increasing speed at which data accumulates (90% of all information was collected over the last 2 years) and to the speed of data processing; real-time processing technologies have recently become more in demand.
Variety – the ability to process structured and unstructured information of various formats simultaneously. The main distinguishing feature of structured information is that it can be classified; an example is information about customer transactions. Unstructured information includes video, audio files, free text and information coming from social networks; today, 80% of information is unstructured. Such information requires complex analysis before it can be useful for further processing.
Veracity – the reliability of the data, to which users attach ever greater importance. For example, Internet companies find it difficult to separate actions performed on their websites by robots from those performed by people, which ultimately complicates data analysis.
Value – the usefulness of the accumulated information. Big Data must be of value to the company, for example by helping to improve business processes, reporting or cost optimization.

If all five of the above conditions are met, the accumulated volumes of data can be classified as big data.

Areas of application of Big Data

The scope of use of Big Data technologies is extensive. Thus, with the help of Big Data, you can learn about customer preferences, the effectiveness of marketing campaigns, or conduct risk analysis. Below are the results of a survey by the IBM Institute on the areas of use of Big Data in companies.

As can be seen from the diagram, most companies use Big Data in the field of customer service, the second most popular area is operational efficiency; in the field of risk management, Big Data is less common at the moment.

It should also be noted that Big Data is one of the fastest growing areas of information technology; according to statistics, the total amount of data received and stored doubles every 1.2 years.
Between 2012 and 2014, the amount of data transferred monthly by mobile networks increased by 81%. According to Cisco estimates, in 2014 the volume of mobile traffic was 2.5 exabytes (a unit of measurement of the amount of information equal to 10^18 standard bytes) per month, and in 2019 it will be equal to 24.3 exabytes.
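As a rough check on these figures, the implied average annual growth rate between the 2014 value and the 2019 forecast can be computed directly from the numbers quoted above (simple arithmetic, no external data):

```python
# Implied compound annual growth rate of mobile traffic from the Cisco
# figures quoted above: 2.5 EB/month in 2014 -> 24.3 EB/month in 2019.
start_eb, end_eb, years = 2.5, 24.3, 5

cagr = (end_eb / start_eb) ** (1 / years) - 1
print(f"Implied average annual growth: {cagr:.0%}")  # about 58% per year
```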
Thus, despite its relatively young age, Big Data is an already established field of technology that has become widespread in many areas of business and plays an important role in the development of companies.

Big Data Technologies
Technologies used for collecting and processing Big Data can be divided into 3 groups:
  • Software;
  • Equipment;
  • Services.

The most common data processing (DP) approaches include:
SQL – a structured query language that allows you to work with databases. Using SQL, data can be created and modified, while the data array as a whole is managed by an appropriate database management system.
NoSQL – the term stands for Not Only SQL. It covers a number of approaches to implementing databases that differ from the models used in traditional relational DBMSs. They are convenient when the data structure is constantly changing, for example when collecting and storing information from social networks.
MapReduce – a model for distributing computation. It is used for parallel computing over very large data sets (petabytes or more). In this model it is not the data that is transferred to the program for processing, but the program to the data, so a query is itself a separate program. Data is processed sequentially by two methods: Map, which selects preliminary data, and Reduce, which aggregates it.
Hadoop – used to implement search and contextual mechanisms for high-load sites such as Facebook, eBay and Amazon. A distinctive feature is that the system is protected against the failure of any cluster node, since each data block has at least one copy on another node.
SAP HANA – a high-performance NewSQL platform for data storage and processing that provides fast query processing. A further distinctive feature is that SAP HANA simplifies the system landscape, reducing the cost of supporting analytical systems.
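The Map/Reduce division of labor described above can be sketched in miniature. The following single-process word count illustrates only the model, not any particular framework; real systems such as Hadoop distribute the two phases across a cluster:

```python
from collections import defaultdict
from itertools import chain

def map_phase(document):
    """Map: emit a (key, value) pair for every word in a document."""
    for word in document.split():
        yield word.lower(), 1

def reduce_phase(pairs):
    """Reduce: aggregate all values emitted for the same key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

docs = ["Big Data is big", "data about data"]
counts = reduce_phase(chain.from_iterable(map_phase(d) for d in docs))
print(counts)  # {'big': 2, 'data': 3, 'is': 1, 'about': 1}
```

In a cluster setting, each node would run `map_phase` over its local chunk of data (the program moves to the data, as noted above), and the framework would group the emitted pairs by key before running `reduce_phase`.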

Technological equipment includes:

  • servers;
  • infrastructure equipment.
Servers include data storage systems.
Infrastructure equipment includes platform acceleration tools, uninterruptible power supplies, server console sets, etc.

Services.
Services include services for building the architecture of a database system, arranging and optimizing the infrastructure and ensuring the security of data storage.

Software, hardware, and services together form comprehensive platforms for data storage and analysis. Companies such as Microsoft, HP, EMC offer services for the development, deployment and management of Big Data solutions.

Applications in industries
Big Data has become widespread in many business sectors. It is used in healthcare, telecommunications, trade, logistics and financial companies, as well as in government administration.
Below are some examples of Big Data applications in some of the industries.

Retail
The databases of retail stores can accumulate a lot of information about customers, inventory management systems, and supplies of commercial products. This information can be useful in all areas of store activity.

Thus, with the help of accumulated information, you can manage the supply of goods, their storage and sale. Based on the accumulated information, it is possible to predict the demand and supply of goods. Also, a data processing and analysis system can solve other problems of a retailer, for example, optimizing costs or preparing reporting.

Financial services
Big Data makes it possible to analyze a borrower's creditworthiness and is also useful for credit scoring and underwriting. The introduction of Big Data technologies will reduce the time needed to review loan applications. With the help of Big Data, it is possible to analyze the transactions of a specific client and offer banking services suited to him.

Telecom
In the telecommunications industry, Big Data has become widespread among mobile operators.
Cellular operators, along with financial institutions, have some of the most voluminous databases, which allows them to conduct the most in-depth analysis of accumulated information.
The main purpose of data analysis is to retain existing customers and attract new ones. To do this, companies segment customers, analyze their traffic, and determine a subscriber's social affiliation.

In addition to using Big Data for marketing purposes, technologies are used to prevent fraudulent financial transactions.

Mining and petroleum industries
Big Data is used both in the extraction of minerals and in their processing and marketing. Based on the information received, enterprises can draw conclusions about the efficiency of field development, track the schedule for major repairs and the condition of equipment, and forecast demand for products and prices.

According to a survey by Tech Pro Research, Big Data is most widespread in the telecommunications industry, as well as in engineering, IT, financial and government enterprises. According to the results of this survey, Big Data is less popular in education and healthcare. The survey results are presented below:

Examples of using Big Data in companies
Today, Big Data is being actively implemented in foreign companies. Companies such as Nasdaq, Facebook, Google, IBM, VISA, Master Card, Bank of America, HSBC, AT&T, Coca Cola, Starbucks and Netflix are already using Big Data resources.

The applications of the processed information are varied and vary depending on the industry and the tasks that need to be performed.
Next, examples of the application of Big Data technologies in practice will be presented.

HSBC uses Big Data technologies to combat fraudulent transactions with plastic cards. With the help of Big Data, the company increased the efficiency of the security service by 3 times, and the recognition of fraudulent incidents by 10 times. The economic effect from the introduction of these technologies exceeded $10 million.

VISA's antifraud system automatically identifies fraudulent transactions; it currently helps prevent fraudulent payments totaling $2 billion annually.

IBM's Watson supercomputer analyzes the flow of data on monetary transactions in real time. According to IBM, Watson increased the number of fraudulent transactions detected by 15%, reduced false positives by 50%, and increased the amount of money protected from such transactions by 60%.
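None of these vendors' internal models are public, so as a minimal illustration of the idea behind real-time transaction screening, here is a toy rule that flags a payment deviating strongly from a customer's historical spending (the threshold and sample amounts are invented):

```python
import statistics

def flag_suspicious(history, amount, z_threshold=3.0):
    """Flag a transaction whose amount deviates strongly from the
    customer's historical mean (a toy z-score rule, not a real system)."""
    mean = statistics.fmean(history)
    stdev = statistics.pstdev(history) or 1.0  # guard against zero spread
    z = (amount - mean) / stdev
    return z > z_threshold

history = [25.0, 30.0, 27.5, 40.0, 22.0, 35.0]
print(flag_suspicious(history, 32.0))   # typical purchase -> False
print(flag_suspicious(history, 900.0))  # extreme outlier -> True
```

Production systems combine many such signals (merchant, geography, device, timing) in learned models rather than a single statistical rule, but the principle of scoring each transaction against the customer's own history is the same.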

Procter & Gamble uses Big Data to design new products and create global marketing campaigns. P&G has created dedicated Business Sphere offices where information can be viewed in real time. This gives the company's management the ability to instantly test hypotheses and conduct experiments. P&G believes that Big Data helps in forecasting company performance.

Office supplies retailer OfficeMax uses Big Data technologies to analyze customer behavior. Big Data analysis made it possible to increase B2B revenue by 13% and reduce costs by $400,000 per year.

According to Caterpillar, its distributors miss out on $9 to $18 billion in profits each year simply because they have not implemented Big Data processing technologies. Big Data would allow customers to manage their fleets more efficiently by analyzing the information coming from sensors installed on the machines.

Today it is already possible to analyze the condition of key components, their degree of wear, and manage fuel and maintenance costs.

Luxottica Group is a manufacturer of sports eyewear under brands such as Ray-Ban, Persol and Oakley. The company uses Big Data technologies to analyze the behavior of potential customers and for "smart" SMS marketing. As a result, Luxottica Group identified more than 100 million of its most valuable customers and increased the effectiveness of its marketing campaigns by 10%.

With the help of Yandex Data Factory, the developers of World of Tanks analyze player behavior. Big Data technologies made it possible to analyze the behavior of 100 thousand World of Tanks players across more than 100 parameters (information about purchases, games, experience, etc.). The analysis produced a forecast of player churn. This information makes it possible to reduce player departures and work with participants in a targeted manner. The resulting model proved 20-30% more effective than the gaming industry's standard analysis tools.
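The actual Yandex Data Factory model is not public, so purely as a hypothetical illustration of churn scoring, here is a logistic function over a few invented player features (the feature names, weights, and bias are assumptions, not the real parameters):

```python
import math

# Illustrative only: weights and features are invented for this sketch.
WEIGHTS = {"days_inactive": 0.35, "battles_per_week": -0.10, "purchases": -0.8}
BIAS = -1.0

def churn_probability(player):
    """Toy logistic churn score: higher inactivity raises the probability,
    frequent battles and purchases lower it."""
    z = BIAS + sum(WEIGHTS[f] * player.get(f, 0.0) for f in WEIGHTS)
    return 1 / (1 + math.exp(-z))

active = {"days_inactive": 1, "battles_per_week": 20, "purchases": 2}
fading = {"days_inactive": 14, "battles_per_week": 1, "purchases": 0}

print(f"active: {churn_probability(active):.2f}")  # about 0.01
print(f"fading: {churn_probability(fading):.2f}")  # about 0.98
```

A real system would learn the weights from historical data on players who actually left, and score all players regularly so that retention campaigns can target those at highest risk.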

The German Ministry of Labor uses Big Data in its work on analyzing incoming applications for unemployment benefits. Analysis of the information revealed that 20% of benefits had been paid undeservedly. With the help of Big Data, the Ministry of Labor reduced costs by 10 billion euros.

Toronto Children's Hospital implemented Project Artemis. This is an information system that collects and analyzes data on babies in real time. The system monitors 1260 indicators of each child’s condition every second. Project Artemis makes it possible to predict the unstable condition of a child and begin the prevention of diseases in children.

OVERVIEW OF THE WORLD BIG DATA MARKET

Current state of the world market
In 2014, Big Data, according to the Data Collective, became one of the priority investment areas in the venture industry. According to the Computerra information portal, this is due to the fact that developments in this area have begun to bring significant results for their users. Over the past year, the number of companies with implemented projects in the field of big data management has increased by 125%, and the market volume has grown by 45% compared to 2013.

The majority of Big Data market revenue, according to Wikibon, in 2014 was made up of services, their share was equal to 40% of total revenue (see chart below):

If we consider Big Data for 2014 by subtype, the market will look like this:

According to Wikibon, applications and analytics accounted for 36% of Big Data revenue in 2014, computing equipment for 17%, and data storage technologies for 15%. The least revenue was generated by NoSQL technologies, infrastructure equipment and corporate networking.

The most popular Big Data technologies are in-memory platforms from vendors such as SAP (HANA) and Oracle; the T-Systems survey showed that they were chosen by 30% of the companies surveyed. The second most popular were NoSQL platforms (18% of users); companies also used analytical platforms from Splunk and Dell, chosen by 15% of companies. According to the survey results, Hadoop/MapReduce products turned out to be the least useful for solving Big Data problems.

According to an Accenture survey, in more than 50% of companies using Big Data technologies, Big Data costs range from 21% to 30%.
According to the following Accenture analysis, 76% of companies believe that these costs will increase in 2015, and 24% of companies will not change their budget for Big Data technologies. This suggests that in these companies Big Data has become an established area of ​​IT, which has become an integral part of the company’s development.

The results of the Economist Intelligence Unit survey confirm the positive effect of implementing Big Data. 46% of companies say that using Big Data technologies they have improved customer service by more than 10%, 33% of companies have optimized inventory and improved the productivity of fixed assets, and 32% of companies have improved planning processes.

Big Data in different countries of the world
Today, Big Data technologies are most often implemented in US companies, but other countries around the world have already begun to show interest. In 2014, according to IDC, countries in Europe, the Middle East, Asia (excluding Japan) and Africa accounted for 45% of the market for software, services and equipment in the field of Big Data.

Also, according to the CIO survey, companies from the Asia-Pacific region are rapidly adopting new solutions in the field of Big Data analysis, secure storage and cloud technologies. Latin America is in second place in terms of the number of investments in the development of Big Data technologies, ahead of European countries and the USA.
Next, a description and forecasts for the development of the Big Data market in several countries will be presented.

China
The volume of information in China is 909 exabytes, equal to 10% of the total volume of information in the world; by 2020 it will reach 8,060 exabytes, and China's share of global information will also grow, reaching 18% within 5 years. China's Big Data market has some of the fastest growth dynamics in the world.

Brazil
At the end of 2014, Brazil had accumulated 212 exabytes of information, which is 3% of the global volume. By 2020, the volume of information will grow to 1,600 exabytes, which will account for 4% of the world's information.

India
According to EMC, the volume of accumulated data in India at the end of 2014 is 326 exabytes, which is 5% of the total volume of information. By 2020, the volume of information will grow to 2800 exabytes, which will account for 6% of the world's information.

Japan
The volume of accumulated data in Japan at the end of 2014 is 495 exabytes, which is 8% of the world's total. By 2020 the volume of information will grow to 2,200 exabytes, but Japan's share will decrease to 5% of the total volume of information in the world.
Thus, Japan's share of the market will shrink by more than 30%.

Germany
According to EMC, the volume of accumulated data in Germany at the end of 2014 is 230 exabytes, which is 4% of the total volume of information in the world. By 2020, the volume of information will grow to 1100 exabytes and amount to 2%.
In the German market, a large share of revenue, according to Experton Group forecasts, will be generated by the services segment, the share of which in 2015 will be 54%, and in 2019 will increase to 59%; the shares of software and hardware, on the contrary, will decrease.

Overall, the market size will grow from 1.345 billion euros in 2015 to 3.198 billion euros in 2019, an average growth rate of 24%.
Thus, based on the analytics of CIO and EMC, we can conclude that the developing countries of the world will become the markets of active Big Data technology development in the coming years.

Main market trends
According to IDG Enterprise, in 2015 companies' spending on Big Data will average $7.4 million per company; large companies intend to spend approximately $13.8 million, and small and medium-sized companies about $1.6 million.
Most of the investment will be in areas such as data analysis, visualization and data collection.
Based on current trends and market demand, investments in 2015 will be used to improve data quality, improve planning and forecasting, and increase data processing speed.
Companies in the financial sector, according to Bain Company’s Insights Analysis, will make significant investments, so in 2015 they plan to spend $6.4 billion on Big Data technologies, the average growth rate of investments will be 22% until 2020. Internet companies plan to spend $2.8 billion, with an average growth rate of 26% for Big Data spending.
An Economist Intelligence Unit survey asked about priority areas for Big Data development in 2014 and over the next three years; the distribution of answers is as follows:

According to IDC forecasts, market development trends are as follows:

  • In the next 5 years, costs for cloud solutions in the field of Big Data technologies will grow 3 times faster than costs for local solutions. Hybrid platforms for data storage will become in demand.
  • The growth of applications using sophisticated and predictive analytics, including machine learning, will accelerate in 2015, with the market for such applications growing 65% faster than applications that do not use predictive analytics.
  • Media analytics will triple in 2015 and will become a key driver of growth in the Big Data technology market.
  • The trend of introducing solutions for analyzing the constant flow of information that is applicable to the Internet of Things will accelerate.
  • By 2018, 50% of users will interact with services based on cognitive computing.
Market Drivers and Limiters
IDC experts identified 3 drivers of the Big Data market in 2015:

According to an Accenture survey, data security issues are now the main barrier to the implementation of Big Data technologies, with more than 51% of respondents confirming that they are worried about ensuring data protection and confidentiality. 47% of companies reported the impossibility of implementing Big Data due to limited budgets, 41% of companies indicated a lack of qualified personnel as a problem.

Wikibon predicts that the Big Data market will grow to $38.4 billion in 2015, up 36% year-on-year. In the coming years, there will be a decline in growth rates to 10% in 2017. Taking into account these forecasts, the market size in 2020 will be equal to 68.7 billion US dollars.

The distribution of the global Big Data market by business category will look like this:

As can be seen from the diagram, the majority of the market will be occupied by technologies in the field of improving customer service. Targeted marketing will be the second priority for companies until 2019; in 2020, according to Heavy Reading, it will give way to solutions to improve operational efficiency.
The segment “improving customer service” will also have the highest growth rate, with an increase of 49% annually.
The market forecast for Big Data subtypes will look like this:

The predominant market share, as can be seen from the diagram, is occupied by professional services. The highest growth rate will be in applications with analytics: their share will increase from the current 12% to 18% in 2020, when the segment will be worth $12.3 billion. The share of computing equipment, on the contrary, will fall from 20% to 14%, amounting to about $9.3 billion in 2020. The cloud technologies market will gradually grow, reaching $6.3 billion in 2020, while the market share of data storage solutions will decrease from 15% in 2014 to 13% in 2020, equal to $8.9 billion in monetary terms.
According to Bain & Company’s Insights Analysis forecast, the distribution of the Big Data market by industry in 2020 will be as follows:

  • The financial industry will spend $6.4 billion on Big Data with an average growth rate of 22% per year;
  • Internet companies will spend $2.8 billion and the average cost growth rate will be 26% over the next 5 years;
  • Public sector costs will be commensurate with the costs of Internet companies, but the growth rate will be lower - 22%;
  • The telecommunications sector will grow at a CAGR of 40% to reach US$1.2 billion in 2020;

Energy companies will invest a relatively small amount in these technologies - $800 million, but the growth rate will be one of the highest - 54% annually.
Thus, the largest share of the Big Data market in 2020 will be taken by companies in the financial industry, and the fastest growing sector will be energy.
Following analysts' forecasts, the total market size will increase in the coming years. Market growth will be achieved through the implementation of Big Data technologies in developing countries of the world, as can be seen from the graph below.

The projected market size will depend on how developing countries perceive Big Data technologies and whether they will be as popular as in developed countries. In 2014, developing countries of the world accounted for 40% of the volume of accumulated information. According to EMC's forecast, the current market structure, with a predominance of developed countries, will change in 2017. According to EMC analytics, in 2020 the share of developing countries will be more than 60%.
According to Cisco and EMC, developing countries around the world will work quite actively with Big Data, largely due to the availability of technology and the accumulation of a sufficient amount of information to the Big Data level. The world map presented on the next page will show the forecast for the increase in volume and growth rate of Big Data by region.

ANALYSIS OF THE RUSSIAN MARKET

Current state of the Russian market

According to the results of a study by CNews Analytics and Oracle, the level of maturity of the Russian Big Data market has increased over the past year. Respondents, representing 108 large enterprises from various industries, demonstrated a higher degree of awareness of these technologies, as well as an established understanding of the potential of such solutions for their business.
As of 2014, according to IDC, Russia has accumulated 155 exabytes of information, which is only 1.8% of the world's data. The volume of information by 2020 will reach 980 exabytes and occupy 2.2%. Thus, the average growth rate of information volume will be 36% per year.
IDC estimates the Russian market at $340 million, of which $100 million are SAP solutions, approximately $240 million are similar solutions from Oracle, IBM, SAS, Microsoft, etc.
The growth rate of the Russian Big Data market is no less than 50% per year.
It is predicted that positive dynamics will continue in this sector of the Russian IT market, even in conditions of general economic stagnation. This is due to the fact that businesses continue to demand solutions that improve operational efficiency, as well as optimize costs, improve forecasting accuracy and minimize possible company risks.
The main service providers in the field of Big Data on the Russian market are:
  • Oracle
  • Microsoft
  • Cloudera
  • Hortonworks
  • Teradata.
Market overview by industry and experience in using Big Data in companies
According to CNews, only 10% of companies in Russia have begun to use Big Data technologies, while worldwide the share of such companies is about 30%. Readiness for Big Data projects is growing in many sectors of the Russian economy, according to a report from CNews Analytics and Oracle. More than a third of the surveyed companies (37%) have started working with Big Data technologies: 20% are already using such solutions, and 17% are beginning to experiment with them. Another third of respondents are currently considering this possibility.

In Russia, Big Data technologies are most popular in the banking and telecom sectors, but they are also in demand in the mining industry, energy, retail, logistics companies and the public sector.
Next, examples of the use of Big Data in Russian realities will be considered.

Telecom
Telecom operators have some of the most voluminous databases, which allows them to conduct the most in-depth analysis of accumulated information.
One of the areas of application of Big Data technology is subscriber loyalty management.
The main purpose of data analysis is to retain existing customers and attract new ones. To do this, companies segment customers, analyze their traffic, and determine the social profile of the subscriber. In addition to using this information for marketing purposes, telecom operators apply these technologies to prevent fraudulent financial transactions.
One of the striking examples of this industry is VimpelCom. The company uses Big Data to improve the quality of service at the level of each subscriber, compile reports, analyze data for network development, combat spam and personalize services.

Banks
A significant proportion of Big Data users are specialists from the financial industry. One successful experiment was carried out at the Ural Bank for Reconstruction and Development, where the accumulated information began to be used for client analysis: the bank started offering tailored loans, deposits and other services. Within a year of using these technologies, the bank's retail loan portfolio grew by 55%.
Alfa-Bank analyzes information from social networks, processes loan applications, and analyzes the behavior of users of the company’s website.
Sberbank also began processing a massive amount of data to segment clients, prevent fraudulent activities, cross-sell, and manage risks. In the future, it is planned to improve the service and analyze customer actions in real time.
The All-Russian Regional Development Bank analyzes the behavior of plastic card holders. This makes it possible to identify transactions that are atypical for a particular client, thereby increasing the likelihood of detecting theft of funds from plastic cards.
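The per-client check described above can be sketched with a simple z-score rule. The data, threshold and function names below are invented for illustration; production systems use far richer behavioral models:

```python
from statistics import mean, stdev

def is_atypical(history, amount, threshold=3.0):
    """Flag a transaction whose amount deviates from the client's
    own spending history by more than `threshold` standard deviations."""
    if len(history) < 2:
        return False  # not enough history to judge
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return amount != mu
    return abs(amount - mu) / sigma > threshold

# A client who usually spends 20-40 units; a 500-unit charge stands out.
history = [25, 30, 22, 35, 28, 31, 27, 33]
print(is_atypical(history, 500))  # large deviation -> True
print(is_atypical(history, 29))   # typical amount  -> False
```

Because the baseline is each client's own history, the same amount can be normal for one cardholder and suspicious for another.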

Retail
In Russia, Big Data technologies have been implemented by both online and offline trading companies. Today, according to CNews Analytics, Big Data is used by 20% of retailers. 75% of retail professionals consider Big Data necessary for the development of a competitive company promotion strategy. According to Hadoop statistics, after the implementation of Big Data technology, profits in trading organizations increase by 7-10%.
M.Video specialists report improved logistics planning after the implementation of SAP HANA; as a result of the rollout, the preparation of annual reports was cut from 10 days to 3, and daily data loading from 3 hours to 30 minutes.
Wikimart uses these technologies to generate recommendations for site visitors.
One of the first offline stores to introduce Big Data analysis in Russia was Lenta. With the help of Big Data, retail began to study information about customers from cash register receipts. The retailer collects information to create behavioral models, which makes it possible to make more informed decisions at the operational and commercial level.

Oil and gas industry
In this industry, the scope of Big Data is quite wide. Big Data technologies can be applied to mineral extraction: analyzing the extraction process itself and determining the most effective production methods, monitoring drilling, assessing the quality of raw materials, and tracking the processing and marketing of the final product. In Russia, Transneft and Rosneft have already begun to use these technologies.

Government bodies
In countries such as Germany, Australia, Spain, Japan, Brazil and Pakistan, Big Data technologies are used to solve national issues. These technologies help government authorities more effectively provide services to the population and provide targeted social support.
In Russia, these technologies have begun to be adopted by government bodies such as the Pension Fund, the Federal Tax Service and the Compulsory Health Insurance Fund. The potential for projects using Big Data is great: these technologies could help improve the quality of services and, as a result, the standard of living of the population.

Logistics and transport
Big Data can also be used by transport companies. Using Big Data technologies, you can track your car fleet, take into account fuel costs, and monitor customer requests.
Russian Railways implemented Big Data technologies together with SAP. These technologies helped reduce the reporting preparation time by 43.5 times (from 14.5 hours to 20 minutes), and increase the accuracy of cost distribution by 40 times. Big Data was also introduced into planning and tariff regulation processes. In total, the companies use more than 300 systems based on SAP solutions, 4 data centers are involved, and the number of users is 220,000.

Main drivers and limiters of the market
The drivers for the development of Big Data technologies in the Russian market are:
  • Increased interest on the part of users in the capabilities of Big Data as a way to increase the competitiveness of a company;
  • Development of methods for processing media files at a global level;
  • Transfer of servers processing personal information to the territory of Russia, in accordance with the adopted law on the storage and processing of personal data;
  • Implementation of the industry plan for import substitution of software. This plan includes government support for domestic software manufacturers, as well as the provision of preferences for domestic IT products when purchasing at public expense.
  • In the new economic situation, with the dollar exchange rate having almost doubled, there will be a trend toward greater use of Russian cloud service providers rather than foreign ones;
  • Creation of technology parks that contribute to the development of the information technology market, including the Big Data market;
  • State program for the implementation of grid systems based on Big Data technologies.

The main barriers to the development of Big Data in the Russian market are:

  • Ensuring data security and confidentiality;
  • Lack of qualified personnel;
  • Insufficiency of accumulated information resources to the Big Data level in most Russian companies;
  • Difficulties in introducing new technologies into established information systems of companies;
  • The high cost of Big Data technologies, which leads to a limited number of enterprises that have the opportunity to implement these technologies;
  • Political and economic uncertainty, which led to the outflow of capital and the freezing of investment projects in Russia;
  • Rising prices for imported products and a surge in inflation, according to IDC, are slowing down the development of the entire IT market.
Russian market forecast
As of today, the Russian Big Data market is not as developed as in advanced economies. Most Russian companies show interest in it but do not yet venture to adopt these technologies.
Examples of large companies that have already benefited from the use of Big Data technologies are increasing awareness of the capabilities of these technologies.
Analysts also have quite optimistic forecasts regarding the Russian market. IDC believes that the Russian market share will increase over the next 5 years, unlike the German and Japanese markets.
By 2020, the volume of Big Data in Russia will grow from the current 1.8% to 2.2% of the global data volume. The amount of information will grow, according to EMC, from the current 155 exabytes to 980 exabytes in 2020.
At the moment, Russia continues to accumulate the volume of information to the level of Big Data.
According to a CNews Analytics survey, 44% of surveyed companies work with data of no more than 100 terabytes, and only 13% work with volumes above 500 terabytes.

Nevertheless, the Russian market, following global trends, will increase. As of 2014, IDC estimates the market size at $340 million.
The market growth rate in previous years was 50% per year; if it remains at the same level, then in 2018 the market volume will reach $1.7 billion. The share of the Russian market in the world market will be about 3%, increasing from the current 1.2%.

The most receptive industries to the use of Big Data in Russia include:

  • Retail and banks – customer base analysis and evaluation of the effect of marketing campaigns are most important for them;
  • Telecom – customer base segmentation and traffic monetization;
  • Public sector – reporting, analysis of applications from the public, etc.;
  • Oil companies – monitoring of work and planning of production and sales;
  • Energy companies – creation of intelligent electric power systems, operational monitoring and forecasting.
In developed countries, Big Data has become widespread in the fields of healthcare, insurance, metallurgy, Internet companies and manufacturing enterprises; most likely, in the near future, Russian companies from these areas will also appreciate the effect of introducing Big Data and will adapt these technologies in their industries.
In Russia, as well as in the world, in the near future there will be a trend towards data visualization, analysis of media files and the development of the Internet of things.
Despite the general stagnation of the economy, in the coming years, analysts predict further growth of the Big Data market, primarily due to the fact that the use of Big Data technologies gives its users a competitive advantage in terms of increasing the operational efficiency of the business, attracting additional flow of customers, minimizing risks and implementation of data forecasting technologies.
Thus, we can conclude that the Big Data segment in Russia is at the formation stage, but the demand for these technologies is increasing every year.

Main results of the market analysis

World market
At the end of 2014, the Big Data market is characterized by the following parameters:
  • market volume amounted to 28.5 billion US dollars, an increase of 45% compared to the previous year;
  • the majority of Big Data market revenue came from services, their share was equal to 40% of total revenue;
  • 36% of revenue came from Big Data applications and analytics, 17% from computing equipment and 15% from data storage technologies;
  • The most popular for solving Big Data problems are in-memory platforms from companies such as SAP (HANA) and Oracle.
  • the number of companies with implemented projects in the field of Big Data management increased by 125%;
The market forecast for the next years is as follows:
  • in 2015 the market volume will reach 38.4 billion US dollars, in 2020 – 68.7 billion US dollars;
  • the average growth rate will be 16% annually;
  • average company costs for Big Data technologies will be $13.8 million for large companies and $1.6 million for small and medium-sized businesses;
  • technologies will be most widespread in the areas of customer service and targeted marketing;
  • In 2017, the global market structure will change towards the predominance of user companies from developing countries.
Russian market
The Russian Big Data market is at the stage of formation, the results of 2014 are as follows:
  • market volume reached USD 340 million;
  • the average market growth rate in previous years was 50% annually;
  • the total volume of accumulated information was 155 exabytes;
  • 10% of Russian companies began to use Big Data technologies;
  • Big Data technologies were more popular in the banking sector, telecoms, Internet companies and retail.
The Russian market forecast for the coming years is as follows:
  • the volume of the Russian market in 2015 will reach 500 million US dollars, and in 2018 – 1.7 billion US dollars;
  • the share of the Russian market in the global market will be about 3% in 2018;
  • the amount of accumulated data in 2020 will be 980 exabytes;
  • data volume will grow to 2.2% of global data volume in 2020;
  • Technologies of data visualization, media file analysis and the Internet of things will become most popular.
Based on the results of the analysis, we can conclude that the Big Data market is still in the early stages of development, and in the near future we will see its growth and expansion of the capabilities of these technologies.

Thank you for taking the time to read this voluminous work, subscribe to our blog - we promise many new interesting publications!

Big data is a concept used in information technology and in marketing. The term "big data" describes the analysis and management of large volumes of data. Thus, big data is information that, due to its sheer volume, cannot be processed in traditional ways.

Modern life is impossible to imagine without digital technology. The world's data warehouses are constantly being replenished, so it is necessary both to continuously change the conditions for storing information and to look for new ways to increase the capacity of its media. Based on expert opinion, the growth of big data and its accelerating pace are current realities. As already mentioned, information appears non-stop. Huge volumes of it are generated by news sites, file-sharing services and social networks, yet these produce only a small part of the total volume.

IDC Digital Universe, after conducting a study, stated that within 5 years the volume of data on the entire Earth will reach forty zettabytes. This means that for every person on the planet there will be 5200 GB of information.


It is common knowledge that people are not the main producers of information. The main sources are machines that continuously interact: the operating systems of computers, tablets and mobile phones, intelligent systems, monitoring tools, surveillance systems, and so on. Together, they set a rapid rate of data growth, which increases the need for both physical and virtual servers and, in turn, drives the expansion and construction of new data centers.

Most often, big data is defined as information that exceeds the volume of a PC's hard drive and cannot be processed by traditional methods that are used to process and analyze information with a smaller volume.

To summarize, big data processing technology ultimately comes down to 3 main areas, which, in turn, solve 3 types of problems:

  1. Storing and managing huge volumes of data - up to hundreds of terabytes and petabytes in size - that relational databases cannot effectively use.
  2. Organization of unstructured information - texts, images, videos and other types of data.
  3. Big data analysis (big data analytics) - this covers ways of working with unstructured information, creating analytical data reports, and introducing predictive models.

The market for big data projects is closely interconnected with the business analytics (BA) market, whose volume in 2012 amounted to about $100 billion and which includes network technologies, software, technical services and servers.

Automation of company activities, in particular revenue assurance (RA) solutions, is also inextricably linked with the use of big data technologies. Systems in this area now contain tools for detecting inconsistencies and for in-depth data analysis, and they also help identify possible losses or inaccuracies in information that could reduce the sector's results.

Russian companies confirm that there is demand for big data technologies; they separately note that the main factors driving big data development in Russia are the growth of data volumes, the need to make management decisions quickly, and the need to improve their quality.

What role does big data play in marketing?

It's no secret that information is one of the main components of successful forecasting and development of a marketing strategy, if you know how to use it.

Big data analysis is indispensable in determining the target audience, its interests and activity. In other words, the skillful use of big data allows you to accurately predict the development of a company.

Using, for example, the well-known RTB auction model, with the help of big data analysis it is easy to make sure that advertising is displayed only to those potential buyers who are interested in purchasing a service or product.
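As a toy illustration of this filtering idea (the interest scores, category names and thresholds below are invented, not part of any real RTB protocol):

```python
def should_bid(user_profile, product_category, min_interest=0.5):
    """Decide whether to bid for an ad impression in an RTB auction,
    based on a hypothetical per-category purchase-interest score."""
    interest = user_profile.get(product_category, 0.0)
    return interest >= min_interest

def bid_price(user_profile, product_category, base_cpm=1.0):
    """Scale the bid by predicted interest, so the ad budget flows
    to the users most likely to buy; return 0.0 to skip the auction."""
    if not should_bid(user_profile, product_category):
        return 0.0
    return round(base_cpm * user_profile[product_category], 2)

profile = {"sneakers": 0.9, "laptops": 0.2}
print(bid_price(profile, "sneakers"))  # high interest -> bid 0.9
print(bid_price(profile, "laptops"))   # low interest  -> no bid, 0.0
```

The point of the sketch is that impressions are bought only for users whose predicted interest clears a threshold, which is exactly the targeting benefit the text describes.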

Applications of big data in marketing:

  1. Allows you to recognize potential buyers and attract the appropriate audience on the Internet.
  2. Helps assess customer satisfaction.
  3. Helps match the service offered to the needs of the buyer.
  4. Facilitates the search and implementation of new methods to increase customer loyalty.
  5. Simplifies the creation of projects that will subsequently be in demand.

A particular example is the Google Trends service. With its help, a marketer can forecast seasonal demand for a particular product, along with the geography of queries and their fluctuations. By comparing this information with the statistics of your own website, it is fairly easy to plan an advertising budget by region and month.
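The budget planning just described can be sketched as follows. The regional demand indices and conversion rates are hypothetical stand-ins for what would actually come from Google Trends and the site's own analytics:

```python
def allocate_budget(total_budget, trend_index, site_conversion):
    """Split an advertising budget across regions in proportion to
    (search-interest index x observed site conversion rate)."""
    weights = {r: trend_index[r] * site_conversion.get(r, 0.0)
               for r in trend_index}
    total = sum(weights.values())
    return {r: round(total_budget * w / total, 2)
            for r, w in weights.items()}

# Hypothetical demand indices (0-100) and conversion rates per region.
trend_index = {"Moscow": 80, "SPb": 60, "Kazan": 40}
site_conversion = {"Moscow": 0.02, "SPb": 0.03, "Kazan": 0.01}
print(allocate_budget(100_000, trend_index, site_conversion))
```

Here SPb gets the largest share despite lower search interest, because its conversion rate is higher; this is the kind of correction that comparing external trends with your own statistics enables.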

  • Distribution of advertising budget: what is worth spending on

    How and where to store big data

    File system – this is where big data is organized and stored. The information is distributed across a large number of hard drives on many machines.

    The "map" (index) keeps track of where each piece of information is directly stored.

    In order to insure against unforeseen circumstances, it is customary to save each piece of information several times - it is recommended to do this three times.

    For example, after collecting individual transactions in a retail network, all information about each individual transaction will be stored on multiple servers and hard drives, and the “map” will index the file location for each specific transaction.

    In order to organize data storage in large volumes, you can use standard technical equipment and publicly available software (for example, Hadoop).
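The scheme above – fixed-size blocks, triple replication, and a "map" indexing block locations – can be sketched in miniature. Node names and the tiny block size are illustrative; real systems such as Hadoop HDFS use large blocks (on the order of 128 MB) and place replicas with rack awareness:

```python
import itertools

BLOCK_SIZE = 4    # bytes per block (tiny, for illustration only)
REPLICATION = 3   # each block is stored three times, as the text suggests

class MiniDFS:
    def __init__(self, nodes):
        self.nodes = {n: {} for n in nodes}  # node -> {block_id: bytes}
        self.block_map = {}                  # the "map": file -> [(block_id, nodes)]
        self._rr = itertools.cycle(nodes)    # round-robin replica placement

    def put(self, filename, data):
        """Split data into blocks and store each block on REPLICATION nodes."""
        blocks = [data[i:i + BLOCK_SIZE]
                  for i in range(0, len(data), BLOCK_SIZE)]
        self.block_map[filename] = []
        for i, block in enumerate(blocks):
            block_id = f"{filename}#{i}"
            replicas = [next(self._rr) for _ in range(REPLICATION)]
            for node in replicas:
                self.nodes[node][block_id] = block
            self.block_map[filename].append((block_id, replicas))

    def get(self, filename):
        """Reassemble the file by reading each block from its first replica."""
        return b"".join(self.nodes[replicas[0]][block_id]
                        for block_id, replicas in self.block_map[filename])

dfs = MiniDFS(["node1", "node2", "node3", "node4"])
dfs.put("sales.log", b"transaction-records")
print(dfs.get("sales.log"))  # b'transaction-records'
```

If any single node fails, every block still has two other replicas, and the map tells the reader where to find them – which is the insurance against "unforeseen circumstances" mentioned above.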

    Big data and business analytics: the difference between concepts

    Today, business analytics describes results achieved over a specific period of time, while the current speed of big data processing makes analysis predictive, so its recommendations can be relied on going forward. Big data technologies also make it possible to analyze more types of data than the tools used in business analytics, allowing you to draw not only on structured data warehouses but on significantly wider resources.

    Business analytics and big data are similar in many ways, but there are the following differences:

    • Big data is used to process a volume of information that is significantly larger than business analytics, which defines the very concept of big data.
    • With the help of big data, you can process rapidly arriving and changing data, which makes analysis interactive: in most cases, results are generated faster than a web page loads.
    • Big data can be used when processing data that does not have a structure, work with which should begin only by ensuring its storage and collection. In addition, it is necessary to apply algorithms that can identify the main patterns in the created arrays.

    The process of business analytics is not very similar to the work of big data. As a rule, business analytics tends to obtain results by adding specific values: an example is the annual sales volume, calculated as the sum of all paid invoices. In the process of working with big data, calculations are made by building a model step by step:

    • putting forward a hypothesis;
    • construction of statistical, visual and semantic models;
    • testing the validity of the hypothesis based on the specified models;
    • putting forward the following hypothesis.

    In order to complete the research cycle, the visual results must be interpreted through interactive, knowledge-based queries. An adaptive machine learning algorithm can also be developed.
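The step-by-step cycle above (hypothesis → model → test → next hypothesis) might look like this in code; the data set, the candidate models and the stopping rule are invented for the example:

```python
def fit_constant(xs, ys):
    """Hypothesis 1: the data is flat (a single mean value)."""
    c = sum(ys) / len(ys)
    return lambda x: c

def fit_linear(xs, ys):
    """Hypothesis 2: the data has a linear trend (least squares)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    return lambda x: a + b * x

def mean_sq_error(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def modelling_cycle(xs, ys, tolerance=1.0):
    """Try hypotheses in order; keep the first whose model passes the test."""
    for name, fit in [("constant", fit_constant), ("linear", fit_linear)]:
        model = fit(xs, ys)
        if mean_sq_error(model, xs, ys) < tolerance:
            return name, model   # hypothesis survives the test
    return name, model           # fall back to the last hypothesis tried

xs = [1, 2, 3, 4, 5]
ys = [2.1, 4.0, 6.2, 7.9, 10.1]  # roughly y = 2x
name, model = modelling_cycle(xs, ys)
print(name)  # the constant model fails its test; "linear" passes
```

Unlike the single aggregation step of classic business analytics, each iteration here proposes a hypothesis, builds a model, and tests it against the data before moving on.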

    Expert opinion

    You cannot blindly rely only on the opinions of analysts

    Vyacheslav Nazarov,

    General Director of the Russian representative office of Archos, Moscow

    About a year ago, based on expert opinion, we launched a completely new product on the market: a tablet gaming console. Its compactness and sufficient computing power won recognition among fans of computer games. It should be noted that this group, despite its "narrowness," had fairly high purchasing power. At first the new product collected many positive reviews in the media and received an approving assessment from our partners. However, it soon became clear that tablet sales were quite low. The product never became a mass-market success.

    Error. Our flaw was that the interests of the target audience were not fully studied. Users who prefer to play on a tablet do not need advanced graphics, since they mostly play simple games. Serious gamers are accustomed to playing on more advanced computer platforms. There was no massive advertising of our product, the marketing campaign was weak, and in the end the tablet did not find its buyer in either of these groups.

    Consequences. Production of the product had to be reduced by almost 40% compared to originally planned volumes. Of course, there were no big losses, nor were there any planned profits. However, this forced us to adjust some strategic objectives. The most valuable thing that we have irretrievably lost is our time.

    Advice. You need to think ahead. Product lines should be thought through two or three steps in advance. What does that mean? When launching a model range today, it is desirable to understand its fate tomorrow and to have at least an approximate picture of what will happen to it in a year and a half. Complete detail is unlikely, of course, but a basic plan should still be drawn up.

    And you shouldn’t trust analysts entirely. Experts’ assessments must be compared with one’s own statistical data, as well as with the operational situation on the market. If your product is not fully developed, you should not release it to the market, because for the buyer the first impression is the most important, and then convincing him will not be an easy task.

    A very important tip in case of failure is to make a quick decision. You absolutely can’t just watch and wait. Solving a problem without delay is always much easier and cheaper than fixing a neglected one.

    What problems does the big data system create?

    There are three main groups of problems of big data systems, which in foreign literature are combined into 3V - Volume, Velocity and Variety, that is:

  1. Volume.
  2. Processing speed.
  3. Lack of structure.

The issue of storing large volumes of information comes down to organizing the right conditions, that is, creating the space and capacity to hold it. Speed is a matter not so much of the slowdowns caused by outdated processing methods as of interactivity: the faster information is processed, the more productive the result.

  1. The problem of unstructuredness stems from the diversity of sources and their differing formats and quality. Successfully integrating and processing big data requires both preparatory work and analytical tools or systems.
  2. The limit on the "magnitude" of the data also has a great influence: the value is hard to determine, so it is hard to calculate what financial investment and what technologies will be required. Nevertheless, for certain volumes, such as terabytes, new processing methods are successfully used today and are constantly being improved.
  3. The lack of generally accepted principles for working with big data is another problem, complicated by the aforementioned heterogeneity of the flows. New methods of big data analysis are being created to solve it; according to representatives of universities in New York, Washington and California, a separate discipline, even a science, of big data is not far off. This is the main reason companies are in no hurry to launch big data projects. Another factor is the high cost.
  4. Difficulties also arise in selecting the data for analysis and the algorithm of actions. To date there is no common understanding of which data carries valuable information and warrants big data analytics, and which data can be ignored. A related problem is also clear: the market lacks industry professionals who can perform in-depth analysis, report on solving the problem and thereby bring profit.
  5. There is also a moral side to the question: is collecting data without the user's knowledge any different from a gross invasion of privacy? It is worth noting that data collection can improve quality of life: for example, continuous data collection in the Google and Yandex systems helps those companies improve their services to match consumer needs. These services record every user click, location and site visited, all messages and purchases, and all of this makes it possible to display advertising based on user behavior. The user did not consent to this collection: no such choice was offered. This leads to the next problem: how securely is the information stored? Information about potential buyers, their purchase history and the sites they visit can help solve many business problems, but whether the platforms buyers use are safe is a highly controversial issue. Many point out that today not a single data storage facility, even military servers, is sufficiently protected from hacker attacks.

Step-by-step use of big data

Stage 1. Technological involvement of the company in a strategic project.

The technical specialists' first task is to draft a development concept: an analysis of the development paths in the areas that need it most.

To determine the team's composition and tasks, a conversation is held with the customers, after which the required resources are analyzed. Based on this, the organization decides either to outsource all tasks entirely or to create a hybrid team of specialists from its own and other organizations.

According to statistics, many companies use exactly this scheme: an internal team of experts monitors the quality of work and sets the direction, while external specialists directly test hypotheses about developing a given area.

Stage 2. Finding a data scientist.

The manager assembles the team and is responsible for the development of the project; HR employees play a direct role in forming the internal team.

First of all, such a team needs a data analytics engineer, also known as a data scientist, who will form hypotheses and analyze the array of information. The correlations he identifies will later be used to build new products and services.

The HR department's task is especially important at the initial stages: its employees decide who exactly will do the work of developing the project, where to find them, and how to motivate them. A data analytics engineer is not easy to find; such specialists are "piece goods".

Every serious company needs a specialist of this profile, otherwise the project loses focus. The analytics engineer combines three roles: developer, analyst and business analyst. In addition, he must have the communication skills to present the results of his work and enough knowledge to explain his thinking in detail.


Search examples

1. A taxi company called "Big Data" was organized in Moscow. Along the route, passengers solved tasks in professional analytics; if a passenger answered most questions correctly, the company offered him a job. The main drawback of this recruiting technique is that most people are reluctant to participate in such projects: only a few agreed to the interview.

2. Holding a dedicated business analytics competition with a prize. A large Russian bank used this method: more than 1,000 people took part in the hackathon, and those who performed best were offered jobs. Unfortunately, most of the winners did not want the position, since their motivation was only the prize, but several people still agreed to join the team.

3. Searching among data specialists who understand business analytics and can bring order by building the correct algorithm of actions. The required skills include programming, knowledge of Python, R, Statistica, RapidMiner, and other skills no less important for a business analyst.

Stage 3. Creating a team for development.

A well-coordinated team is needed. For advanced analytics, such as company innovation, a manager is required to create and develop the business intelligence direction.

The research engineer constructs and tests hypotheses for successful development along the chosen vector.

The manager must organize the development of the chosen line of business, create new products, and coordinate them with customers. His responsibilities also include calculating business cases.

A development manager must work closely with everyone. Through meetings with the employees responsible for different areas of the project, the analytics engineer and the business development manager identify the needs and opportunities for big data analysis. After analyzing the situation, the manager builds cases on whose basis the company will decide how to further develop a direction, service, or product.


3 principles of working with big data

We can highlight the main methods of working with big data:

  1. Horizontal scalability. Because the amount of data is huge, any system that processes it must be expandable: if the volume of data grows several times over, the amount of hardware in the cluster should grow by roughly the same factor.
  2. Fault tolerance. The principle of horizontal scalability implies a large number of machines in a cluster; Yahoo's Hadoop cluster, for example, has more than 42,000 of them. Any method of working with big data must take possible failures into account and survive them without consequences.
  3. Data locality. Data in large systems is distributed across many machines, so if data is stored on server No. 1 but processed on server No. 2, transferring it may cost more than processing it. That is why, at the design stage, great attention is paid to storing and processing data on the same machine.

All methods of working with big data, one way or another, adhere to these three principles.
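
The first and third principles can be sketched in a few lines. The following is a toy simulation, not a real framework: records are spread across "nodes" by hashing their key (horizontal scalability), and each node aggregates only the shard it stores (data locality). All names, keys, and values are invented for illustration.

```python
import hashlib

NUM_NODES = 4  # pretend cluster size

def node_for(key: str) -> int:
    """Deterministically map a key to a node (hash partitioning)."""
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % NUM_NODES

def partition(records):
    """Distribute records so that each node stores its own shard."""
    shards = {n: [] for n in range(NUM_NODES)}
    for key, value in records:
        shards[node_for(key)].append((key, value))
    return shards

def process_locally(shards):
    """Each node sums only the data it stores: no cross-node transfer."""
    totals = {}
    for shard in shards.values():
        for key, value in shard:
            totals[key] = totals.get(key, 0) + value
    return totals

records = [("msk", 1), ("spb", 2), ("msk", 3), ("nsk", 5)]
shards = partition(records)
print(process_locally(shards))  # totals: msk=4, spb=2, nsk=5
```

Scaling out is then just increasing `NUM_NODES`: the same hash function spreads the data over more machines without changing the processing logic.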

How to use the big data system

Effective big data solutions for a wide variety of business areas are achieved through the many combinations of software and hardware that currently exist.

An important advantage of big data is the ability to combine new tools with those already used in the area. This plays a particularly important role in cross-disciplinary projects, such as multi-channel sales and customer support.

To work with big data, a certain sequence is important:

  • First, data is collected;
  • then the information is structured; for this purpose, dashboards (structuring tools) are used;
  • at the next stage, insights and contexts are created, on the basis of which recommendations for decision-making are formed. Due to the high costs of data collection, the main task is to determine the purpose of using the information obtained.
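
The collect, structure, and insight stages above can be sketched as a minimal pipeline. The data, field names, and the "insight" rule below are invented for illustration only.

```python
# Raw records as they might arrive from a source system.
raw_events = [
    "2024-05-01;store_12;coffee;120",
    "2024-05-01;store_12;donut;60",
    "2024-05-02;store_07;coffee;120",
]

def collect():
    """Stage 1: collect (here the 'source' is just a list of strings)."""
    return raw_events

def structure(events):
    """Stage 2: turn raw records into uniform dictionaries."""
    rows = []
    for line in events:
        date, store, product, price = line.split(";")
        rows.append({"date": date, "store": store,
                     "product": product, "price": int(price)})
    return rows

def insights(rows):
    """Stage 3: derive an insight, e.g. revenue per product,
    as the basis for a recommendation."""
    revenue = {}
    for r in rows:
        revenue[r["product"]] = revenue.get(r["product"], 0) + r["price"]
    return revenue

print(insights(structure(collect())))  # {'coffee': 240, 'donut': 60}
```

In a real system each stage would be a separate service or job, but the sequence stays the same: collection feeds structuring, and structuring feeds the analytics that produce recommendations.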

Example. Advertising agencies can use location information aggregated by telecommunications companies to deliver targeted advertising. The same information is applicable in other areas related to the sale and provision of goods and services.

The information obtained in this way may be key in deciding whether to open a store in a particular area.

Consider outdoor billboards in London: without such data, measuring the audience would today require placing a special measuring device near each billboard, whereas mobile operators always know basic information about their subscribers: their location, family status, and so on.

Another potential area of ​​application for big data is collecting information about the number of visitors to various events.

Example. The organizers of football matches cannot know in advance exactly how many people will come to a match. However, they could obtain such information from mobile operators: where potential visitors were located over a certain period (a month, a week, a day) before the match. The organizers would thus be able to plan the location of the event according to the preferences of the target audience.

Big data also provides incomparable benefits for the banking sector, which can use the processed data to identify unscrupulous cardholders.

Example. When a cardholder reports a card lost or stolen, the bank can track the location of the card being used for payment and of the holder's mobile phone to verify the claim. If a bank representative sees that the payment card and the holder's mobile phone are in the same area, the owner is most likely using the card himself.
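
A check of this kind can be sketched as comparing two coordinates against a distance threshold. This is a hypothetical illustration: the threshold, coordinates, and function names are all assumptions, not a real bank's logic.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    r = 6371.0  # mean Earth radius, km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = (math.sin(dp / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

def same_area(card_pos, phone_pos, threshold_km=5.0):
    """True if the card payment and the phone are within threshold_km."""
    return haversine_km(*card_pos, *phone_pos) <= threshold_km

# Payment in central Moscow, phone nearby: likely the real holder.
print(same_area((55.751, 37.618), (55.760, 37.620)))   # True
# Phone in St. Petersburg while the card pays in Moscow: suspicious.
print(same_area((55.751, 37.618), (59.934, 30.335)))   # False
```

A production fraud system would of course combine many more signals (spending patterns, device fingerprints, time of day), but the location comparison above captures the idea described in the example.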

Information of this kind gives companies many new opportunities, and the big data market continues to develop.

The main difficulty in implementing big data is the complexity of calculating the case. This process is complicated by the presence of a large number of unknowns.

It is quite difficult to make predictions for the future, and data about the past is not always at hand. In this situation, the most important thing is to plan your initial actions:

  1. Defining the specific issue to which big data processing will be applied helps determine the concept and set the vector of further actions. Focusing data collection on this issue, and using all available tools and methods, gives a clearer picture. This approach will also greatly simplify decision-making later on.
  2. The likelihood that a big data project will succeed with a team lacking the relevant skills and experience is extremely low. The knowledge needed for such complex research is usually acquired through long practice, which is why previous experience is so important in this field. Equally important is a culture of using the information obtained: such data offers many opportunities, including the abuse of the material received, so to use it for good one should adhere to elementary rules of correct data processing.
  3. Insights are the core value of the technology. The market still suffers an acute shortage of strong specialists who understand the laws of business, the importance of information, and its scope of application. Since data analysis is a key way to achieve goals and develop a business, one should strive to develop a specific model of behavior and perception. Then big data will be beneficial and play a positive role in solving business management issues.

Successful cases of big data implementation

Some of the cases listed below were more successful in data collection, others in big data analytics and in ways of applying the data obtained during the study.

  1. Tinkoff Credit Systems used the EMC Greenplum platform for massively parallel computing. The continuous growth in the bank's flow of card users created a need for faster data processing, so it was decided to use big data and to work with unstructured information, as well as corporate information obtained from disparate sources. Its specialists also noted that an analytical layer of the federal data warehouse is being introduced at Russia's Federal Tax Service; on its basis it is planned to organize a space providing access to tax system data for subsequent processing and statistics.
  2. The Russian startup Synqera, which does online big data analysis and developed the Simplate platform, is worth considering separately. It processes large volumes of data about consumers: their purchases, age, mood and state of mind. A chain of cosmetics stores installed sensors at checkouts that can recognize customer emotions; after the mood is determined, information about the buyer and the time of purchase is analyzed, and the buyer then receives targeted information about discounts and promotions. The solution increased consumer loyalty and the seller's income.
  3. Worth mentioning, too, is the use of big data technologies at Dunkin' Donuts, which, like the previous example, used online analysis to increase profits. At its retail outlets, displays showed special offers whose content changed every minute, based on the time of day and the products in stock. From cash receipts the company learned which items were in greatest demand. This approach increased both revenue and inventory turnover.
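
The kind of rule behind such rotating displays can be sketched in a few lines. This is an invented illustration, not Dunkin' Donuts' actual system: the products, stock levels, and time windows are all assumptions.

```python
# Hypothetical current stock at one outlet.
stock = {"coffee": 50, "donut": 0, "bagel": 20}

def pick_offer(hour: int) -> str:
    """Choose a promotional message by hour of day and current stock."""
    if 6 <= hour < 11 and stock["coffee"] > 0:
        # Morning window: push breakfast items.
        candidates = ["coffee", "bagel"]
    else:
        candidates = ["donut", "bagel", "coffee"]
    # Never advertise something that is out of stock.
    available = [p for p in candidates if stock.get(p, 0) > 0]
    return f"Special offer: {available[0]}!" if available else "Welcome!"

print(pick_offer(8))   # Special offer: coffee!
print(pick_offer(15))  # Special offer: bagel!  (donuts are out of stock)
```

In practice the stock figures would be fed live from cash-register data, which is exactly the feedback loop the case describes.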

Thus, processing big data has a positive effect on solving business problems. An important factor, of course, is the choice of strategy and the use of the latest developments in the field of big data.

Information about the company

Archos. Field of activity: production and sale of electronic equipment. Territory: sales offices are open in nine countries (Spain, China, Russia, USA, France, etc.). Number of branch staff: 5 (in the Russian representative office).