R, RStudio, Python, Scala, Java

"Modern Data Science"

Today, Modern Data Science stands on the shoulders of traditional Data Science and Six Sigma for statistical uses and goes much further with big and rich data.  Modern Data Science uses regression and prediction, classification, and hypothesis testing, and adds deep learning and methodologies for "recommendation systems."  Some people see these as modern Artificial Intelligence (AI) systems.  Early advances were made by Google, whose published work on MapReduce and the Google File System inspired Hadoop, while it managed a world-class internet search service at scale.  Facebook has been key for advanced social enablement, data collection, and recommendation engines.  Our world has advanced rapidly with Big Data and Modern Data Science because of these commendable and early efforts.  Data Science is an emerging area of study for many college programs.

"Data scientist" has also become a popular occupation, with Harvard Business Review dubbing it "The Sexiest Job of the 21st Century."

There is also room for maverick data discovery with Modern Data Science.  Maverick creative finds can benefit companies quickly.  Our structured approach allows for creativity in its highest forms while capturing structured steps so that discoveries can move into production use quickly.

"Data Science Process" diagram (Wikipedia) - Decomposition (orange points)

Below is the process shown in Wikipedia for Data Science.  You will notice that steps 1 through 3 are traditional steps for the Extract, Transform, and Load (ETL) process for Enterprise Data Warehouses (EDWs).  Data Science can include formal ETL or informal end-user ETL.  The orange circle numbers and notes were added by Alexicon to demonstrate our understanding and our focus on embedding learned know-how from Data Science activities (4) into the formal ETL process (1, 2 & 3) and/or including it in the database layer (3 & 5) as embedded Models & Algorithms for runtime computations.  The EDW will end up having a “clean dataset” or table(s) and algorithms or computation code for database use.
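As a rough illustration of embedding a learned model in the database layer, here is a minimal Python sketch.  It assumes a simple linear model whose coefficients were found during Data Science work (step 4); the coefficients, the table, and the function name predicted_spend are all hypothetical and are not taken from any Alexicon system.  SQLite is used only because it ships with Python; a production EDW would use its own stored procedure or user-defined function facility.

```python
import sqlite3

# Hypothetical coefficients standing in for a model fitted during
# Data Science work (step 4 in the diagram).
INTERCEPT = 10.0
WEIGHT_VISITS = 2.5
WEIGHT_TENURE = 0.5


def predicted_spend(visits, tenure_months):
    """Linear scoring function that SQL queries can call at runtime."""
    return INTERCEPT + WEIGHT_VISITS * visits + WEIGHT_TENURE * tenure_months


conn = sqlite3.connect(":memory:")

# Register the model so that it can be used inside SQL statements.
conn.create_function("predicted_spend", 2, predicted_spend)

# A tiny "clean dataset" table (illustrative data only).
conn.execute(
    "CREATE TABLE customers (name TEXT, visits INTEGER, tenure_months INTEGER)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [("Ann", 4, 12), ("Bob", 1, 2)],
)

# The query returns scored rows; the caller never handles the model itself.
scores = conn.execute(
    "SELECT name, predicted_spend(visits, tenure_months) "
    "FROM customers ORDER BY name"
).fetchall()
print(scores)
```

With SQLite the registered function still runs inside the application process, so this shows the calling pattern rather than the performance of a true in-database model.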

"From the business perspective, data science is an integral part of competitive intelligence, a newly emerging field that encompasses a number of activities, such as data mining and data analysis."

Source: wikipedia.org/wiki/Data_science

In-database Computations Are Better and Faster

The place for proven computations is in the database (not client desktop tools).  Algorithmically efficient computations are important with large data sets.  They must also be meticulously vetted for accuracy.  Users expect response times from newer databases in under two seconds versus many seconds or minutes from traditional big reporting systems.  There is still the need to balance summary and detail levels.  The goal is to move computations to run in the database at super-fast speeds, with only summary or aggregate results sent to the visualization layer or report.  Smaller result sets from the database are easier on the network, as well.  Fast query results help BI tool users get what they want when they need it.  User and analyst time typically runs at a premium.  Many BI systems become a learning hub for a company by fostering user exploration, and speed is important for that.  Highly used systems provide great return on investment (ROI) across the income statement (P&L) and balance sheet over many years.
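The difference between computing in the database and computing in a client tool can be sketched in a few lines of Python.  This is a minimal example under stated assumptions: the sales table and its values are made up, and SQLite stands in for a real analytic database.  Both approaches give the same totals, but the in-database query sends only one row per region to the client.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [
        ("East", 100.0),
        ("East", 250.0),
        ("West", 75.0),
        ("West", 125.0),
        ("West", 300.0),
    ],
)

# Client-side approach: every detail row crosses the network,
# and the client tool does the summing.
detail = conn.execute("SELECT region, amount FROM sales").fetchall()
client_totals = {}
for region, amount in detail:
    client_totals[region] = client_totals.get(region, 0.0) + amount

# In-database approach: the database aggregates,
# and only the summary rows are returned.
summary = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
db_totals = dict(summary)

print(f"{len(detail)} detail rows sent vs. {len(summary)} summary rows sent")
```

With five rows the saving is trivial; with millions of detail rows, the summary result set stays at one row per region.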

High-dimensional Statistics

Advanced uses in massive multidimensional spaces are one of our key focuses.  These new systems consider many text and numeric values, including where you live, your likes, your dislikes, your gender, and many other aspects of your publicly available information or “consumer data.”  There is also an abundance of worldwide government and business data.  Common usage areas for data are financial data, geographic points and areas, weather, and Industrial IoT (sensor or machine data).  We see a strong relationship between multidimensional BI and high-dimensional statistics.  Both potentially have many dimensions and many numeric values in one related or analyzed schema, built on conformed dimensions or matched data, to relate and contrast different measurable aspects of a business.

Modern Data Science makes possible newer and valuable capabilities like recommendation engines.  We see these recommendation engines in play when shopping online, with similar product recommendations or ads being displayed while we browse or shop.  These engines are capable of understanding and associating people with products or people with people because of high-dimensional statistics.  This is the big win!  It is difficult for people to see and comprehend more than two or three dimensions.  We typically use a two-dimensional (2D) space, plane, or X,Y plot to correlate or associate variables.  Moving from two to five dimensions gets difficult, not to mention the tens to hundreds of dimensions and/or numeric values that exist in Enterprise and Big Data environments today.
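To make the idea of associating people with people concrete, here is a small Python sketch that uses cosine similarity, one common way to compare points in a many-dimensional space.  It is an illustration only: the ratings are invented, each list position stands for one hypothetical product, and real recommendation engines use far more dimensions and more sophisticated models.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a vector of all zeros has no direction to compare
    return dot / (norm_a * norm_b)


# Hypothetical ratings: one dimension per product, 0 means "not rated".
ratings = {
    "ann": [5, 4, 0, 0, 1, 5],
    "bob": [5, 5, 0, 1, 0, 4],
    "cara": [0, 1, 5, 4, 5, 0],
}


def most_similar(user, all_ratings):
    """Find the other user whose ratings point in the most similar direction."""
    others = [name for name in all_ratings if name != user]
    return max(
        others,
        key=lambda name: cosine_similarity(all_ratings[user], all_ratings[name]),
    )


match = most_similar("ann", ratings)
print(match)
```

The same function works unchanged whether the vectors have six dimensions or six hundred, which is the point: the machine compares what people cannot plot.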

We are doing things today that were never before possible because of high-dimensional statistics.  This is where Modern Data Science and Machine Learning capabilities have taken a quantum leap in recent years, especially around prediction within massive data sets with high volume, variety, and velocity (the three V’s).  These data sets create complex and fast-moving or even streaming data environments, where Modern Data Science is used as ETL to transform and load data on the fly and/or perform real-time actions.

Data that is classified correctly allows users to compare and contrast different aspects of the enterprise with advanced business analytics capabilities.  Modern Data Science resembles radar when it was originally introduced.  Radar was used for military purposes to give our forces long-range visibility even in darkness and fog (humans could not do this without radar).  High-dimensional statistics works in much the same way: the machine helps find needed or important information in massive multidimensional spaces and see what is not humanly possible otherwise.

Recommendation systems have fostered modern advancements with high-dimensional statistics through deep learning and other techniques used by early adopters like Amazon, Netflix, Yelp, Pandora, and Tinder for online use and for matching (recommending) people to products or people to people.

Amazon (integrated example)

Amazon is a fine example of a company that uses Big Data, Modern Data Science, and Lean Six Sigma (LSS) for internal operations, website operations, product recommendations, and taking orders.  We believe there is much crossover among these bodies of knowledge within a company like Amazon.  They have mastered the management of process and data so well that they offer Amazon Web Services (AWS) so other companies can take advantage of their advanced system infrastructure and architecture to store, compute, and analyze data with advanced capabilities.

Alexicon is committed to helping customers increase enterprise performance capabilities with advanced business analytics using proven Business Management methods, Modern Data Science, and LSS.  We believe this combination is unique and valuable to all companies when its techniques and methods are integrated and coordinated.  This is especially true for major corporations with initiatives to increase sales and/or profit while controlling costs.

The time to get involved in Modern Data Science is now

Your company can start with an "Enterprise Analytics Health Check" and a review of known external and internal data sources or data landscapes to identify where Modern Data Science capability can be used to provide added analytic power for your enterprise.

In our view, Modern Data Science will be integral to the success of major corporations in the coming years, as competition increases.  What’s more, with new opportunities coming down the pike for American businesses, Modern Data Science will provide the ability to ramp up operations to meet that new demand and help businesses capture the best possible portions of the market before new entrants or competitors take that early and important share.

Contact Us to learn more about Modern Data Science
