Channel: SAP HANA and In-Memory Computing

Comparing SAP HANA and Sybase IQ - real world performance tests


When SAP talks about HANA, they quite often talk about the 1000x performance improvements you can get. A customer asked me last week why SAP HANA would provide any improvement over their implementation of Sybase IQ if they pinned all the IQ tables in memory. Their reasoning: with everything in memory, IQ should be just as fast as HANA, right?

 

In fact, there are several capabilities of HANA as compared to IQ which should make it substantially faster in the real world, even when IQ operates entirely in-memory. There is a nice blog written by Chris Jones, which you can read here; it explains this and some other stuff.

 

The important thing to note is that IQ is a disk-based data-warehouse that works well with large volumes of memory for caching. HANA is a transactional developer platform written to be in-memory. As we shall see, there are pros and cons to both.

 

- Both HANA and IQ compress data, but IQ compresses with bitmaps whilst HANA compresses with dictionary encoding, which means that HANA only needs to decompress exactly the data it needs, and it does so inside the CPU cache (a rough sketch of the idea follows this list). Because in-memory databases are limited by memory bandwidth, this should make HANA much faster than IQ for anything which requires materialization of data.

- HANA is optimized for individual writes. With IQ, you lock a table on write, and it stays locked until you commit. This means that you can't have multiple updaters in IQ, whilst HANA uses multi-version concurrency control (MVCC). In practice, many people can write to a HANA table with individual writes, whilst IQ requires one writer per table, loading in batch. It also means that IQ should be much faster for bulk loading than HANA. In fact, Sybase IQ holds the world record for bulk loading at 34TB/hour.

- Both databases have support for SIMD, but HANA is highly optimized for the Intel E7 platform and its SSE2 instructions, which allow multiple additions/multiplications in one CPU cycle. This should mean that combined with the compression, HANA is faster at aggregating than IQ.
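To make the dictionary-encoding point a little more concrete, here is a minimal sketch in plain Python - my own illustration of the general idea, not HANA's actual implementation - showing how a dictionary-encoded column turns a predicate into a cheap comparison against small integer codes:

# Minimal illustration of dictionary encoding (not HANA internals).
# Column values are replaced by small integer codes into a sorted dictionary,
# so a predicate like GENDER = 'M' becomes a comparison against a single code.
def dictionary_encode(values):
    dictionary = sorted(set(values))           # distinct values, sorted once
    code_of = {v: i for i, v in enumerate(dictionary)}
    codes = [code_of[v] for v in values]       # the column becomes small integers
    return dictionary, codes

dictionary, codes = dictionary_encode(["M", "F", "F", "M", "F", "M", "M"])
target = dictionary.index("M")                 # look the predicate value up once
matching_rows = [row for row, code in enumerate(codes) if code == target]
print(dictionary)       # ['F', 'M']
print(codes)            # [1, 0, 0, 1, 0, 1, 1]
print(matching_rows)    # [0, 3, 5, 6] - the rows satisfying GENDER = 'M'

Because the column itself is now just a vector of small integers, scans and aggregations stay cache-friendly, and only the dictionary entries that actually match ever need to be materialized as full values.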

 

So, I decided to load the same data into IQ and HANA, and do some comparisons on the same hardware.

 

Test Environment

 

For this, I used a SAP HANA size "Medium" system from HP. It has 4 CPUs, 40 cores, 512GB RAM, 25x 15k 146GB SAS disks for data and one FusionIO 640GB card for logs. The OS is SUSE Linux for SAP Applications. For my testing, I used one database at a time.

 

For SAP HANA, I used SAP HANA 1.0 SP6 Rev.69, which is the latest available.

 

For IQ, I used Sybase IQ 16.0 SP2, which is also the latest available.

 

Installation

 

It's my first time getting to grips with Sybase IQ, and Mark Mumy was a big help in this SCN thread. IQ doesn't come pre-configured out of the box, and you have to set a few settings to get things to work well. In my database configuration file, I set the following:

 

-c 1g

-iqmc 154000

-iqtc 154000

-iqlm 154000

-iqnumbercpus 40

 

This basically tells IQ to use 1GB for the catalog cache, and to split most of the remaining 512GB of RAM between its main cache, temporary cache and large memory area (roughly 30% of RAM each). Plus, because of hyper-threading, IQ thinks I have 80 CPUs, so I have to tell it that I actually have 40.

 

HANA is definitely easier to install, and requires no special configuration, but this isn't a big deal in the scheme of things.

 

Data Loading

 

The Sybase bulk loader is pretty fiddly and very specific about the file format, number of columns and data quality. Actually this is pretty much the feeling of the IQ platform overall - fantastic technology mixed with a relatively poor user experience. The HANA bulk loader isn't very feature-rich, but it is much less picky than the Sybase loader. This is definitely an area that both platforms could work on.

 

Once you get IQ up and running though, it flies for loading. I'm sure it could be better optimized by an IQ pro, but I found I could load my 62GB fact table in 2m30s. By comparison, it takes roughly 10x as long to load the same data into HANA. This doesn't come as a surprise, because IQ doesn't have to worry about multiple inserters or dictionary encoding. With HANA, you trade off load performance for the behavior of a transactional RDBMS; IQ is a pure data warehouse.
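For a sense of scale, here is the back-of-the-envelope arithmetic those load times imply (my own rough numbers, derived purely from the figures above):

# Rough load throughput implied by the times quoted above (my own arithmetic).
fact_table_gb = 62
iq_load_seconds = 150              # 2m30s into Sybase IQ
hana_load_seconds = 150 * 10       # roughly 10x longer into HANA
iq_tb_per_hour = fact_table_gb / 1024 / (iq_load_seconds / 3600)
hana_tb_per_hour = fact_table_gb / 1024 / (hana_load_seconds / 3600)
print(f"IQ:   ~{iq_tb_per_hour:.2f} TB/hour")     # ~1.45 TB/hour
print(f"HANA: ~{hana_tb_per_hour:.2f} TB/hour")   # ~0.15 TB/hour

That is obviously nowhere near the 34TB/hour record mentioned earlier, which was set on a very different configuration, but it gives a feel for what a single mid-size box can do.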

 

Queries and Aggregations

 

Vishal Sikka often talks about how HANA can aggregate 16m rows/sec/core. In my 40-core system, that should translate to 640m aggregations/sec across the machine. I actually find it is much more variable than this, depending on join complexity and grouping sets. For a simple table you can get as much as 31m/sec/core, and for very complex joins and grouping I see as low as 9m. You will see this in the results below.

 

In both cases, I see massively parallel behavior and all 40 cores are used simultaneously.

 

Still - against my 1.4bn-row table, I ran six queries. I generally find that most questions you can ask fall into one of these categories in terms of performance.

 

Query 1 (SAP HANA: 1.2s / Sybase IQ: 1.2s)
SELECT SUM(AMOUNT)/COUNT(AMOUNT) FROM TRANSACTION

Query 2 (SAP HANA: 1.7s / Sybase IQ: 18.9s)
SELECT GENDER, SUM(AMOUNT)/COUNT(AMOUNT)
FROM TRANSACTION T
JOIN CUSTOMER C ON T.CUSTOMER_ID=C.CUSTOMER_ID
GROUP BY GENDER

Query 3 (SAP HANA: 3.4s / Sybase IQ: 35.3s)
SELECT MERCHANT, GENDER, SUM(AMOUNT)/COUNT(AMOUNT)
FROM TRANSACTION T
JOIN CUSTOMER C ON T.CUSTOMER_ID=C.CUSTOMER_ID
JOIN MERCHANT M ON T.MERCHANT_ID=M.MERCHANT_ID
GROUP BY MERCHANT, GENDER

Query 4 (SAP HANA: 14s / Sybase IQ: 171.8s)
SELECT GENDER, SUM(AMOUNT)/COUNT(DISTINCT T.CUSTOMER_ID)
FROM TRANSACTION T
JOIN CUSTOMER C ON T.CUSTOMER_ID=C.CUSTOMER_ID
JOIN MERCHANT M ON T.MERCHANT_ID=M.MERCHANT_ID
GROUP BY MERCHANT, GENDER

Query 5 (SAP HANA: 3.2s / Sybase IQ: 0.9s)
SELECT GENDER, SUM(TXAMOUNT)/COUNT(TXAMOUNT), COUNT(TXAMOUNT), COUNT(DISTINCT T.CUSTOMER_ID)
FROM TRANSACTION T
JOIN CUSTOMER C ON T.CUSTOMER_ID=C.CUSTOMER_ID
WHERE GENDER='M'
AND DOB BETWEEN '1980-01-01' AND '1989-12-31'
AND MARITALSTATUS='S'
AND POSTCODE LIKE '%TW1%'
GROUP BY GENDER

Query 6 (SAP HANA: 409s / Sybase IQ: 3.8s)
SELECT STDDEV(TXAMOUNT) FROM TRANSACTION

 

With IQ, we see a very similar response time to HANA for Query 1. I'm not sure why that is, but I'm guessing IQ does SUM() and COUNT() on a single table very efficiently. I'd be interested to hear from any IQ expert who can explain this.

 

Once we get into the realm of complex joins and grouping, HANA outperforms IQ 10:1. This is roughly what we expect because HANA stores its data for faster OLAP retrieval.

 

Interestingly, for the last two queries - where we have a lot of restrictions plus a COUNT DISTINCT or a STDDEV - IQ outperforms HANA. We see this in a few places where IQ's more mature OLAP engine can beat HANA.
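To tie these timings back to the aggregation rates quoted earlier, here is a rough back-of-the-envelope check for the first two queries, under the simplifying assumption that each one scans the whole 1.4bn-row fact table:

# Implied HANA aggregation rates (my own rough arithmetic, assuming a full scan
# of the 1.4bn-row TRANSACTION table spread across all 40 cores).
rows = 1.4e9
cores = 40
for query, seconds in [("Query 1 (simple SUM/COUNT)", 1.2),
                       ("Query 2 (join + GROUP BY GENDER)", 1.7)]:
    per_second = rows / seconds
    print(f"{query}: ~{per_second / 1e6:.0f}m rows/sec total, "
          f"~{per_second / cores / 1e6:.0f}m rows/sec/core")
# Query 1: ~1167m rows/sec total, ~29m rows/sec/core
# Query 2: ~824m rows/sec total, ~21m rows/sec/core

That works out at roughly 20-30m rows/sec/core, which sits comfortably inside the 9m-31m range I mentioned above.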

 

The HANA Development Platform

 

This little test doesn't bring out a lot of the qualities of HANA. It's worth making a quick note of these:

 

- If we insert data at the same time as querying, HANA queries will continue to run very nicely. It's less clear to me how IQ will behave.

- With HANA, we have a set of engines which run against the core data: predictive, spatial and business functions.

- HANA provides a full development platform including a development IDE, Integration Services and a Web Server for application build.

- With HANA, we need only store one set of data for OLTP and OLAP workloads. IQ only works as a data warehouse and it requires a separate transactional system.


Conclusions

 

There's no question about it - in the real world, with complex grouping sets, HANA performs 10x as well as IQ. It's worth noting that IQ is a very fast data warehouse - especially compared to a regular RDBMS.

 

But make no mistake, IQ is an excellent data warehouse, and its more mature OLAP engine means that for certain operations, it can significantly outperform HANA.

 

With each release of SAP HANA, the development team optimizes more and more functionality and I have no doubt that what we see here may be very different in future revisions. SAP HANA SP07 is released very soon and I'm interested to see what that will bring.

 

Quick thank you to Mark Mumy for his assistance getting Sybase IQ running nicely!


Data Science Series – What is DS?


Data science is an evolving technology. In this series of blogs I am going to explain the details, including the role of data science in Big Data as well.

 

Introduction:

Data science is a steadily growing discipline that is powering significant change across industries and in companies of every size. It is emerging as a critical source of insights for enterprises faced with massive amounts of data but with no plan for how to systematically extract the value lying therein. The 2011 McKinsey Global Institute report, Big data: The next frontier for innovation, competition, and productivity, paints a stark picture of the shortage of human capital and technological insight needed to tap the full potential of the world's data resources.

 

What is Data Science?

Data science incorporates varying elements and builds on techniques and theories from many fields - including mathematics, statistics, data engineering, pattern recognition and learning, advanced computing, visualization, uncertainty modeling, data warehousing, and high-performance computing - with the goal of extracting meaning from data and creating data products.

Data science is a novel term that is often used interchangeably with competitive intelligence or business analytics, although it is becoming more common.

A practitioner of data science is called a data scientist.

Data scientists solve complex data problems by employing deep expertise in some scientific discipline. It is generally expected that data scientists are able to work with various elements of mathematics, statistics and computer science, although expertise in every one of these subjects is not required. A data scientist is most likely to be an expert in only one or two of these disciplines and proficient in another two or three. Data science must therefore be practiced as a team, where across the membership of the team there is expertise and proficiency in all the disciplines.

Good data scientists are able to apply their skills to achieve a broad spectrum of end results. Some of these include the ability to find and interpret rich data sources; manage large amounts of data despite hardware, software and bandwidth constraints; merge data sources together; ensure consistency of data-sets; create visualizations to aid in understanding data; and build rich tools that enable others to work effectively. The skill-sets and competencies that data scientists employ vary widely. Data scientists are an integral part of competitive intelligence, a newly emerging field that encompasses a number of activities, such as data mining and analysis, that can help businesses gain a competitive edge.

A major goal of data science is to make it easier for others to find and use data. Data science technologies impact how we access data and conduct research across various domains, including the biological sciences, medical informatics, social sciences and the humanities.

 

“Data science isn’t just about the existence of data, or making guesses about what that data might mean; it’s about testing hypotheses and making sure that the conclusions you’re drawing from the data are valid.”

Data scientists perform data science. They use technology and skills to increase awareness, clarity and direction for those working with data. The data scientist role exists to accommodate the rapid changes that occur in our modern environment, and data scientists are tasked with minimizing the disruption that technology and data are having on the way we work, play and learn. Data scientists don't just present data; they present data with an intelligent awareness of the consequences of presenting that data.

 

          Data Science Disciplines

DS_1.JPG                 

What type of skills are needed for a Data Scientist?


It’s a hybrid role that combines the “applied scientist” with the “data engineer”. Many developers, statisticians, analysts and IT professionals have some partial background and are looking to make the transition into data science.

 

Your approach will likely depend on your previous experience. Here are some perspectives below from developers to business analysts.

DS_2.JPG

Two more Blogs on the way...

Big Data webinar series - a theme that resonated in 2013


With more and more people spending much of their existence in the digital world – whether it's for work, play, learning, or socializing – the amount of data being generated passively is growing exponentially. We have seen the volume of data that organizations have to manage and process grow from gigabytes (GB) to terabytes (1,000 GB) and even petabytes (1 million GB). In 2012 alone, industry analyst IDC estimates that the amount of information created and replicated exceeded 2.8 zettabytes (ZB), or trillions of gigabytes. Organizations certainly have no lack of data today - what they lack is a way to make the data useful, i.e. to harness the deluge of data to drive economic value while it still matters. And the real bottleneck is not technology but the availability of practitioners who can envision the possibilities and use the available technologies appropriately.

 

Have you been following the growing momentum of Big Data conversations on-line and at events? Do you consider it just another marketing buzzword and a passing phenomenon?

 

I love Tammy Powlas' comment in her blog:

 

“Quite honestly I had been treating the topic as all “hype” but thanks to participating in a Google Hangout  with SAP’s Steve Lucas and Timo Elliott the other week I’ve started paying attention to the topic”

 

 

Has this sparked your interest in finding out more about Big Data? Then I am sure that the Big Data webinars being conducted by SAP and its ecosystem are exactly what you are looking for.

 

1. Check out the Big Data webinar series that I conceived and executed together with Daniel von Dungen from SAP University Alliances. We hosted over 16 webinars in which SAP employees and SAP Mentors shared their expertise and insights on a wide range of topics, such as:

    • Big Data concepts and opportunities: What is Big Data, How to avoid the Big Data chaos, Big Data Apps for Retail, BI in a Big Data world, Big Data maturity model
    • Practical hands-on technology sessions: Movie sentiment analysis, Gaming App on HANA, Visualizing Big and small data, Set-up and use of Smart Data Access in SP6 of SAP HANA
    • How SAP technologies and Hadoop can be used to address Big Data requirements such as text, geospatial, data visualization and streaming

2. Secondly, the Program Management & Maintenance Strategies (PMMS) SIG at ASUG presented ASUG's 5-part webinar series on Big Data. The series started by explaining the basics of big data and its business benefits, and ended with fairly technical details on how to use SAP big data technologies. It gives ASUG members the knowledge to identify the opportunity big data offers and to start projects for implementing solutions using SAP big data technologies with Hadoop and SAP HANA. Listen to David Burdett, Strategic Technology Advisor, SAP, and John Choate, National Chair, PMMS SIG ASUG, introduce their new webinar series:

 

 

3. Thirdly, Big Data is important but you cannot take advantage of it without the right data management platform – one that helps you find relevancies within your data and incorporate them into business processes.  Implementing Data Management Strategies for a Big Data-enabled EDW is a 3 part webcast series developed by Courtney Driscoll.  The series provides valuable insights to help you turn your Big Data into a key enterprise asset.

 

To make it easy for you to find the webinars – both upcoming and on-demand recordings – be sure to check out the document that I maintain on the Big Data webinar series. I hope you find these webinars useful.

 

Additionally be sure to read the thought leadership articles on the big data website to discover how harnessing the value of Big Data can help your organization compete more effectively.

 

@rukso

SAP & Hortonworks Hangout: Big Data Meets Enterprise Systems


Last month, I hosted a “Big Data Hangout” featuring Hortonworks CTO Ari Zilka and SAP’s Irfan Khan, talking about the future of information architectures that combine the best of today’s Hadoop and enterprise infrastructures.

 

The recording of that session is now available:

 

 

Interested in hearing more? There's another Big Data hangout scheduled for next week, on the topic of "Data Scientists" -- and there are many others.

 

In addition, the voting on sessions for the 2014 Hadoop Summit in Europe is now open, and I submitted a session to talk about some of the great, real-life business cases that I’ve been seeing around the world – not just in web companies, but normal businesses of every size – please vote here!

 

Real-Life Examples of Using Hadoop to Drive Business Innovation

Hadoop creates fundamentally new, disruptive opportunities for business — but what is the best way to turn technology possibilities into business advantage? Drawing from a wide selection of use cases in different industries, this session looks at real-life examples of business innovation through the integration of Hadoop, enterprise information systems, and business creativity.

Why Enterprises Should Be More Interested in Hadoop


hanadoop-example

An example use case of Hadoop and in-memory systems extracted from the CIO guide to using Hadoop and SAP systems.


It’s Time For Two Worlds To Come Together

Earlier this year, I attended the Hadoop Summit in Europe, sponsored by Hortonworks. There were many excellent presentations at the conference, but the divide between “old” and “new” analytics was very clear. There were relatively few “traditional” companies presenting sessions, and those that were seemed faintly embarrassed to mention that they still had data warehouses.

 

The Hadoop use cases discussed were mainly new, standalone systems rather than integrations with more traditional systems or analytic architectures. There were two notable exceptions.

 

The first was by Alasdair Anderson, Global Head of Architecture for HSBC Global Banking and Markets, who presented on the theme of “Enterprise Integration of Disruptive Technologies.”

 

The bank needed a single data platform that could provide 360-degree views of clients, operations and products. To provide this, the team had been struggling with a complex, “brittle” architecture based on over 150 source systems, 900 ETL jobs, 3 data warehouses, and 15 data marts.

 

The resulting system was expensive, and too slow to meet the business needs: it took months or years to make changes. The team concluded that they needed a different way of doing things, one that would support more agile, parallel streams of development, without being disruptive.

 

hsbc

 

HSBC decided to try using Hadoop, with the work done in Guangzhou, China. The project was a big success:

  • Hadoop was installed and operational in a single week
  • The 18 RDBMS data warehouses and marts were ported to Hadoop in 4 weeks
  • The time it took to run an existing batch job dropped from 3 hours to 10 minutes
  • New data sources could be included, such as information about financial derivatives stored in .pdf format.

 

At the same time, however, Anderson explained how his analytics needs were maybe a little different from more traditional data warehousing. The focus of the project was fast-moving, “agile information” typically requiring several different iterations of analysis — and he explained that other parts of the business, such as the retail banking division, did not have the same “funky needs.”

 

He admitted that HSBC “genuinely doesn’t know yet how the new architecture will combine with existing data warehouses in other regions.”

The second notable presentation was by Deutsche Telekom’s Jürgen Urbanski, on how to determine the right technical solutions for different types of enterprise data usage. He presented an overall view of different architectures and suggested questions that should be asked of the business in order to determine which technology was the best fit.

 

Hadoop was generally positioned as the “better” choice, although some of the comparisons with in-memory systems already seemed out of date (e.g. see slide below), and there was little discussion of how to integrate transaction systems (other than as simple data sources).

 

deutsche telekom data storage

 

Integrating Hadoop With Existing Systems

 

The unspoken assumption of many of the delegates seemed to be that it was just a question of time before Hadoop gained the extra features that would enable it to take over all enterprise needs. Some seemed almost proud to ignore existing enterprise data architectures and any best practice learned over the previous decades (information governance springs to mind – this is still a very new concept for many organizations using Hadoop).

 

Many users of enterprise systems, on the other hand, seemed to have decided that Hadoop only applies to web companies, or is restricted to refining semi-structured data before putting it into a “normal” data warehouse.

 

I believe that Hadoop is an incredible opportunity for most enterprises, both large and small. But I also believe that the big changes in enterprise architecture driven by in-memory systems, and the need for analytics close to transactions, mean that the ultimate best practice architecture will be one based on a combination of existing approaches, not just Hadoop alone.

 

With this in mind, SAP has teamed up with major Hadoop providers to combine the speed of in-memory computing with the storage power and flexibility of Hadoop.

 

SAP will redistribute and support the Intel Distribution for Apache Hadoop and the Hortonworks Data Platform — InformationWeek journalist Doug Henschen explains the background to the deals in his article SAP Expands Big Data Push.

 

As part of the series of SAP “Big Data” discussions, Hortonworks CTO Ari Zilka explained why he felt that combining Hadoop and enterprise systems was the best of both worlds. And for more information about how real-life organizations - in fields as diverse as football and genetics - are using Hadoop, visit the SAP Big Data website.

 

For more detailed technical information about how Hadoop can be integrated with traditional information architectures, check out the CIO Guide on Big Data: How to Use Hadoop With Your SAP Software Landscape.

 

Hadoop Summit Europe 2014

 

I’m looking forward to next year’s Hadoop Summit Europe in April, and thoroughly recommend you attend. It’s a great venue and a great crowd. And please vote for my presentation on “Real-Life Examples of Using Hadoop to Drive Business Innovation”!

The first heresy: Could Cloud Foundry run on the HANA Enterprise Cloud?


After reading a recent blog entitled “Cloud Foundry, aiming for ubiquity, sets sights on Google Cloud”, I decided to examine Cloud Foundry in more detail to understand how it might fit into the broader SAP cloud environment.

 

The area of the greatest potential appears to be the HANA Enterprise Cloud (HEC).

 

Architecture of the HANA Enterprise Cloud

 

Note: In this blog, the Application Management Services (AMS) portions of the HANA Enterprise Cloud aren’t going to be considered. I’m more interested in the broader architecture of the platform.

 

Let’s take a quick look at the architecture of the HANA Enterprise Cloud.

 

Here is a detailed diagram provided by SAP:

 

image001.jpg

 

Let’s simplify this diagram a bit.

 

image002.jpg

 

The last step reduces HEC to its simplest structure and shows that HEC is more than just an IaaS. One recent presentation about SAP's cloud offerings refers to HEC as "ERPaaS", which isn’t really accurate, since other applications are also offered in this environment.

 

It is also important to remember that the IaaS layer currently contains a mix of physical boxes for HANA and virtualized servers for other components such as application servers, etc.

 

Enter Cloud Foundry

 

Cloud Foundry is an open source cloud computing platform as a service (PaaS).

 

Here is a very high-level portrayal of the Cloud Foundry architecture.

 

image004.jpg

 

[SOURCE]

 

The interface with the various cloud types (private, public, etc.) is where we want to focus our attention.

 

It is first important to say that Cloud Foundry is designed to be IaaS neutral. This goal is achieved via the Cloud Provider Interface (CPI).

A Cloud Provider Interface (CPI) is an API that BOSH (Foundry’s deployment installer and operations manager) uses to interact with an Infrastructure as a Service (IaaS) provider to create and manage stemcells and VMs. … A CPI abstracts an underlying virtualized infrastructure from the rest of BOSH, and is fundamental to Cloud Foundry's model for deploying and running applications across multiple clouds. [SOURCE] (my emphasis / addition)

Note: For a deep dive into how a CPI works, I recommend this blog.

 

Currently, CPIs exist for the following IaaS environments:

 

IaaS                       Status
CloudStack                 Available
Azure                      Planned
VMware vSphere             Available
VMware vCloud Director     Available
AWS                        Available
Google Cloud Platform      In Progress
OpenStack                  Available

 

Let’s create a simple diagram of the CPI-related architecture. 

image004.jpg

 

Porting Cloud Foundry to the HANA Enterprise Cloud

 

Note: Before we examine this scenario in more detail, it is useful to remember the announcement regarding Cloud Foundry at the Las Vegas TechEd. I’ve analyzed this announcement and my assumption is that applications running on Cloud Foundry will soon be able to use HANA as a database.

 

A port of Cloud Foundry to the HEC would have this general architecture:

image005.jpg

Here is a more detailed representation of the target architecture:

image006.jpg

 

The only necessary implementation would be the creation of the CPI for this integration. If the virtualized machines used in HEC come from a vendor (VMware?) already supported by another CPI, then only support for the physical HANA boxes would need to be implemented from scratch.

 

The question would be whether the optimal solution would be direct CPI access to the virtualized machines / physical HANA Servers.  Another option that might be easier to implement would be the use of the existing SAP HEC Orchestration tools.

 

image007.jpg

 

This orchestration layer has its own REST APIs that could be the entry point for the CPI integration.

 

image008.jpg

[SOURCE]

 

Concerning the difficulty in creating a CPI, one blogger made this remark:

With that said, it is rather amazing that one could encapsulate all of the infrastructure-specific implementation necessary to deploy and manage a distributed system as powerful as Cloud Foundry in less than twenty classes and 1700 lines of code.

 

Thus, a CPI for the HEC seems like a possible / realistic undertaking.
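To make that a little more tangible, here is a purely hypothetical sketch of what the heart of such an adapter might look like if the CPI were implemented against the HEC orchestration REST APIs. The action names (create_vm, delete_vm) follow the BOSH CPI contract, but with simplified signatures; the endpoint paths, payload fields and authentication are invented for illustration, since the actual HEC orchestration APIs are not publicly documented, and this is Python rather than a full CPI implementation:

# Hypothetical sketch only: a CPI-style adapter that forwards BOSH CPI actions
# (create_vm / delete_vm, simplified) to imagined HEC orchestration REST endpoints.
# Endpoint paths, payload fields and auth scheme are illustrative assumptions.
import requests

class HecCpi:
    def __init__(self, base_url, api_token):
        self.base_url = base_url.rstrip("/")
        self.headers = {"Authorization": f"Bearer {api_token}"}

    def create_vm(self, agent_id, stemcell_id, cloud_properties, networks):
        # Ask the (hypothetical) HEC orchestration layer to provision a VM.
        payload = {
            "agent_id": agent_id,
            "image": stemcell_id,
            "size": cloud_properties.get("instance_type", "small"),
            "networks": networks,
        }
        response = requests.post(f"{self.base_url}/vms", json=payload,
                                 headers=self.headers, timeout=300)
        response.raise_for_status()
        return response.json()["vm_id"]     # BOSH expects a VM identifier back

    def delete_vm(self, vm_id):
        # Tear the VM down again via the same (hypothetical) API.
        requests.delete(f"{self.base_url}/vms/{vm_id}",
                        headers=self.headers, timeout=300).raise_for_status()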


But why?

“If we have no heretics we must invent them, for heresy is essential to health and growth.”  - Yevgeny Zamyatin [SOURCE]

 

Just because something is theoretically possible, doesn’t mean that it should really be done.  I’ve just described how Cloud Foundry might be able to run on the HEC architecture – yet, the question might arise: why would you do this?

 

As I’ve stated before, the HEC gains in value if extensions to the hosted applications can be hosted in a single environment rather than multiple platforms. Thus, the combination of a PaaS and the HEC might be quite valuable to customers.

 

SAP already has a PaaS - the HANA Cloud Platform (HCP) - and if any PaaS should run on the HEC, then it should probably be the HCP. Indeed, there may already be ongoing activities in this area. This appears to be the natural evolution for the HEC hosted by SAP.

 

There are, however, a variety of other HEC partners who have other options / perspectives / requirements. These partners might already have Cloud Foundry experience. The viability of porting HCP to other environments – either on-premise environments at customers or cloud environments hosted by others besides SAP – really hasn’t been addressed yet. Thus, the ability to run Cloud Foundry in these partner-hosted environments would provide benefits and increased flexibility for customers in those environments.

HANA SPS 7 first dazzling highlights


Another half year passed and another SAP HANA SP has just been released.

SAP HANA 1.0 SPS 7 is out in the wild as revision 70.00 (double-zero revision, yes Moneypenny) and a lot of things have been changed, added and (hopefully) improved.

http://upload.wikimedia.org/wikipedia/en/9/9b/Miss_Moneypenny_by_Lois_Maxwell.jpg
(linked from here)

 

Don't get me wrong, this is by no means a comprehensive "What's new" report (for that, make sure to check the SPS 7 documentation at https://help.sap.com/hana_platform as soon as it is out).
While the documentation is not yet completely available on help.sap.com, make sure to check the built-in online help in SAP HANA studio (reference guides).

 

Also, check out SAP note 1944771 for release information.

ATTENTION: one of the major changes is that, from now on, there are maintenance revisions that are developed alongside the SP revisions.

The maintenance revisions shall only contain bug fixes and keep functionality stable.

As a consequence of that, the upgrade paths between maintenance and SP revisions must be obeyed.

Read SAP note 1948334: SAP HANA Database Update Paths for Maintenance Revisions for the details.

 

Finally, don't miss out on the WEBCAST series "SAP HANA SPS7 - Overview, Features, Details"

So this is what Ravi asked for - just quick glimpses

 

PlanViz changes

  • GUI improved

03-12-2013 23-53-01.png

There's even the option to execute a query and display the result set.

 

  • aborted plans still visible

failed plan.png

That's right, even though the query aborts with an error, you can review the PlanViz trace up until the POP that failed and analyze what was going on.

 

  • inverted index usage information (visible in the hover over details of POPs)

Thanks to my co-author, program lead and fellow Wuppertaler Richard from whom I stole this example...


Name: Basic predicate

ID: cs_plan13688_ld9506_30103_tableSearch1_pop1

Summary: MANDT = '800'

Execution Time (Self): 0 ms

Execution Time (Inclusive): 0 ms

Execution Start Time: 162,707 ms

Execution End Time: 162,707 ms

Estimated result size: 51

Estimation time: 0.067 ms

Inverted Index: not used

 

  • applied column filters (WHERE condition and analytic privileges) shown at the object where it is applied

search on table.png

Unfortunately there's no time for now to do more tests - but I'm quite happy to see that so many improvement requests have been fulfilled (thanks for all the due follow up Mr. K!).

 

Session grouping, filtering and aggregation features

session filters.png

session  aggregates.png

Fancy graphical display of memory and resource allocation/usage

memory overview.png

Mhmm... I like pie...

resource overview.png

Impressive how these lines have different colors and go upwards...

 

Statisticsserver process now included in indexserver process

Yep, it's not a separate process any longer, doesn't have its own persistence any more and is much more configurable (that's what I heard).

Details are to be found in SAP note 1917938.

As it's not a very visual feature, here are some trace file output lines from the migration ...

 

 i STATS_CTRL       NameServerControllerThread.cpp(00203) : installing...
 i STATS_CTRL       NameServerControllerThread.cpp(00299) : old StatisticsServer: ld9506:34205, volume: 2
 i STATS_CTRL       NameServerControllerThread.cpp(00355) : waiting for start of old StatisticsServer ld9506:34205, volume: 2...
 i STATS_CTRL       NameServerControllerThread.cpp(00371) : waiting for old StatisticsServer ld9506:34205, volume: 2 to stop all operations...
 i STATS_CTRL       NameServerControllerThread.cpp(00373) : old StatisticsServer ld9506:34205, volume: 2 has stopped all operations
 i STATS_CTRL       NameServerControllerThread.cpp(00376) : old StatisticsServer is ready. starting...
 i STATS_CTRL       CallInterfaceProxy.cpp(00035) : sending install request
 i FileIO           FileStatistics.cpp(00287) : FileFactoryConfiguration::initial(path="/usr/sap/WUP/HDB42/backup/log/", AsyncWriteSubmitActive=auto,AsyncWriteSubmitBlocks=new,AsynReadSubmit=off,#SubmitQueues=1,#CompletionQueues=1)
 i STATS_CTRL       CallInterfaceProxy.cpp(00039) : response to install request: OK
 i STATS_CTRL       NameServerControllerThread.cpp(00299) : old StatisticsServer: ld9506:34205, volume: 2
 i STATS_CTRL       NameServerControllerThread.cpp(00422) : found old StatisticsServer: ld9506:34205, volume: 2, will remove it
 i STATS_CTRL       NameServerControllerThread.cpp(00425) : forcing log backup...
 i FileIO           FileStatistics.cpp(00287) : FileFactoryConfiguration::initial(path="/usr/sap/WUP/HDB42/backup/log/", AsyncWriteSubmitActive=auto,AsyncWriteSubmitBlocks=new,AsynReadSubmit=off,#SubmitQueues=1,#CompletionQueues=1)
 i STATS_CTRL       NameServerControllerThread.cpp(00430) : log backup done. Reply: [OK]
--
[OK]
--
 i STATS_CTRL       NameServerControllerThread.cpp(00433) : stopping hdbstatisticsserver...
 i STATS_CTRL       NameServerControllerThread.cpp(00458) : waiting 5 seconds for stop...
 i Service_Shutdown TREXNameServer.cpp(03734) : setStopping(statisticsserver@ld9506:34205)
 i STATS_CTRL       NameServerControllerThread.cpp(00463) : hdbstatisticsserver stopped
 i STATS_CTRL       NameServerControllerThread.cpp(00466) : remove service from topology...
 i STATS_CTRL       NameServerControllerThread.cpp(00470) : service removed from topology
 i STATS_CTRL       NameServerControllerThread.cpp(00472) : remove volume 2 from topology...
 i STATS_CTRL       NameServerControllerThread.cpp(00476) : volume removed from topology
 i STATS_CTRL       NameServerControllerThread.cpp(00478) : mark volume 2 as forbidden...
 i STATS_CTRL       NameServerControllerThread.cpp(00480) : volume marked as forbidden
 i STATS_CTRL       NameServerControllerThread.cpp(00482) : old StatisticsServer successfully removed
 i STATS_CTRL       NameServerControllerThread.cpp(00405) : removing old section from statisticsserver.ini: statisticsserver_array_BACKUP_LAST_DATA
[...]
 i STATS_CTRL       NameServerControllerThread.cpp(00405) : removing old section from statisticsserver.ini: statisticsserver_view_VOLUMES_OUT_OF_ORDER
 i STATS_CTRL       NameServerControllerThread.cpp(00409) : making sure old StatisticsServer is inactive statisticsserver.ini: statisticsserver_general, active=false
 i STATS_CTRL       NameServerControllerThread.cpp(00222) : installation done

New SQL functions

HASH_SHA256 -  "Returns a 32 byte VARBINARY hash value of the concatenated arguments. The hash is calculated using a SHA256 algorithm."

I guess that Jon and In-Sung Lee will like that...
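If you want to sanity-check the value from the client side, the same digest can be computed in a couple of lines of Python - assuming you hash exactly the same bytes that you pass to the database (HASH_SHA256 returns a 32-byte VARBINARY, which is typically displayed as hex):

# Compute the same SHA-256 digest outside the database, for cross-checking.
import hashlib
payload = b"Hello HANA SPS 7"                       # the exact bytes handed to HASH_SHA256
print(hashlib.sha256(payload).hexdigest().upper())  # 64 hex characters = 32 bytes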

 

Alright, admit it: you also have to blink to believe that all that made it into SPS 7, right?

It's nearly as if Xmas was coming up... wait a minute...

 

Cheers and a great holiday season everybody!

Lars

What's new in SAP HANA SP07?


It's that time of year again, and there is a new release of SAP HANA. If you're into that kind of thing, you can download all of the reference material here - 30 PDF files containing 150MB of detailed information explaining what's going on. Last year, SAP were kind enough to release SAP HANA SP05 during the Thanksgiving holiday weekend, leaving plenty of time for those of us with nothing better to do to read up on it; they were a week later this year, so I've had to catch up fast!

 

My sense of what happened during the planning phase for SAP HANA SP07 is that the teams got together and said "what shall we achieve?", and the consensus was: let's make everything better. Let's look at the stuff it brings.

 

Better SQL optimization and performance

 

This is huge, for me. The SQL optimizer now does a much better job of guessing which engine to send a query to. I've not tested this exhaustively but if you are using SQL, this can provide up to 100x improvement in performance of OLAP-style queries and SQL is now only about 10% slower than an equivalent HANA model. The Model is still faster, because the optimization is built in advance.

 

In addition, COUNT DISTINCT is now 50% faster than I found previously. This is very nice, because this was a pain point for many customers.

 

All in all this is an important set of improvements to the core engine.

 

New SQL Features

 

There aren't a ton of new SQL features, and I was hoping for more ANSI SQL compliance but there's some interesting stuff:

 

- Ability to replicate tables between nodes to improve join colocation

- Handful of new SQL functions for working days, currency conversion, grouping, SHA Hashing, Binary Conversions

- New BINTEXT data type and various conversion and setting functions

 

Unfortunately recursive queries and CTEs still aren't supported, which is a shame.

 

SQLScript

 

There is a new debugger and editor but I don't see any new SQLScript functionality, which is disappointing. I was hoping for some new functionality especially around UDFs and passing of arrays, but the reference guide looks unchanged since SP6. Hopefully this will be a focus for SP8!

 

Improved Developer Experience

 

SAP has majored on this in SP7, though I think there is more wood to cut for SP8. Thankfully, the regi configuration is now gone, though regi appears to still be there in the background for certain tasks. This should pave the way to the Mac OS X version of HANA Studio (yay!). I tested checking out some projects and it happens 5-10x faster than before, which is very nice.

 

The 3 views: Project Explorer, Repositories and Systems View all still exist, which is a shame, but hopefully we'll see those consolidated in SP8. Either way, from a developer perspective, this all feels much better.

 

The web-based IDE has been improved somewhat, although it still falls short of parity with SAP HANA Studio. Some serious work needs to happen here to make HANA a cloud development platform, and crucially, HANA Models are still only visible as XML.

 

Core Data Services gets some new functionality for the definition of relationships and views, but CDS is still very young. I suspect that the unification of the developer experience between the 3 different views, CDS, the HANA Analytic and Calc Views will be a core effort for SP8. Getting this right is the absolute key to an amazing developer experience for Native HANA.

 

HANA Studio has been nicely smoothed around the edges and I'd be very happy to develop large apps with a large number of developers working concurrently. It feels extremely solid and it is much better integrated than other development environments. It feels like developer experience has been put at the top of the agenda. Deleting stuff is easy, for example!

 

Most of the effort in XS seems to have gone into making the experience more unified, with some additional features like validation and associations. I suspect that most of the work in SP7 went on behind the scenes, paving the way for innovation in SP8. We will see.

 

HANA Modeler

 

There aren't major changes from a modeling perspective, and a few features I saw in beta phase didn't make it to the release. But, it feels faster because some things happen in the background, and they have worked on usability. Plus, they fixed a nasty bug in Data Preview where the dataset was always fully materialized. In practice, this makes working with large datasets much easier than ever before, which is excellent news.

 

Also, I seem to see improved performance for filter pushdowns, which again makes working with large datasets better. Anyone who saw my series of blogs on Global Warming will know that I had some fun getting amazing performance with billions of rows. My weather dataset seems to perform noticeably better with SP7.

 

I'd have liked to have seen new aggregation types for AVG, COUNT DISTINCT, Weighted AVG but I guess those have to wait.

 

Instead, we have some much wanted bits of functionality like copy/paste, keyboard navigation and where-used. Yay. Decision Trees now open in a new modeler, which I haven't had a chance to use yet.

 

New: The Spatial Engine

 

Beta support for spatial objects was introduced in HANA SP6 and this is now fully supported in SP7. I haven't had a chance yet to delve into the Spatial Engine in detail, but it looks like the same functionality as SP6 based upon the reference guide. There is a lack of real-world examples for the Spatial Engine, but it promises to be extremely powerful because you can in-line spatial objects into regular tables and then use ESRI geospatial data for analysis against regular relational database tables.

 

There's no other platform on the planet that allows this as far as I know.

 

Massively Improved: Text & Search

 

I haven't had a chance to look at this in detail yet but I'm surprised by the improvements in the text engine. It is now possible to customize the text engine to do the analysis you need and there are substantial enhancements to full text search functionality. In addition there are major new Fuzzy Search improvements.

 

There aren't more core languages supported but focus has been put on Russian, Japanese and Simplified Chinese, for Social Media analysis.

 

Improved: Predictive Analysis Library

 

There are 7 new algorithms: Statistics (Univariate Statistics, Multivariate Statistics, Chi-squared Test for Fitness, Chi-squared Test for Independence, and Variance Equal Test), Partition, Support Vector Machine (SVM), Forecast Smoothing, Substitute Missing Values, Affinity Propagation and Agglomerate Hierarchical Clustering.

 

Multiple Linear Regression, Logistic Regression, Apriori, C4.5 Decision Tree and CHAID Decision Tree have some much-needed improvements, like support of p-values for coefficients and missing value handling.

 

I have no doubt that SAP's KXEN acquisition has allowed a very focussed activity on what improvements were required here.

 

Massively Improved: Smart Data Access

 

Smart Data Access - HANA's federation engine - was first released in HANA SP6 and it was an initial release. It looks like with SP7, SDA is massively improved. This is now one of the jewels in HANA's crown.

 

- Support for Calculation Views including filter push-downs

- Oracle and MSSQL Support as well as generic ODBC support, for any database

- Support for Insert/Update/Delete - previously only SELECT was possible

- Caching with Hive

 

So this brings quite serious data federation scenarios to life.

 

What's really needed in Smart Data Access is temperature control - the moving of cold data out of HANA and into the remote data source. For EDW scenarios, you could bind together HANA and IQ and have a very cost-effective solution. Very exciting potential development for SP8.

 

Enterprise Readiness

 

In my mind, HANA became truly enterprise ready in SP6, and that's shown by the fact that SAP runs its own ERP system for all employees on the HANA platform. However, there were still improvements to be made and some of these have come in SP7.

 

- Improved Monitoring, Alerting and Tracing

- Death of the Statistics Server (leading to better resource usage)

- Storage Snapshots

- Automatic Backups for new Scale-Out Nodes and better support for Backup/Restore in HANA Studio

- Multi-tier System Replication, SSL, Zero Downtime and Compressed Log Transfer

 

This requires a whole blog to itself but it looks like almost any enterprise scenario is now supported. There is still work to do for SP8 - automated failover, encryption of log volumes and backups, active-active HA, query-able and writable standby databases, SNMP monitoring and a few other bits.

 

Final Words

 

As I've written this blog I've been reminded what an amazing, deep and wide application platform SAP HANA has become. I've only had 2 days using SP7 so far, so this is just an initial impression, but it feels like an extremely polished release - the best release so far. What I like the most is that there has been a clear focus on quality-first rather than features-first - for instance, there were rumors of a graph engine, but the graph engine doesn't appear in SP7 as far as I can see.

 

There is plenty of wood to cut for SP8 and I'm sure the HANA team is taking a brief breath of air, before they get on with the planning for the next release. Hopefully this blog helps them with a few places to focus :-)


The second heresy: Does SAP have to own the IaaS layer in the HANA Enterprise Cloud?


I recently blogged about porting Cloud Foundry to the HANA Enterprise Cloud (HEC).  As I was writing that blog, I began to think about other aspects of the HEC that might be interesting to explore.  This blog examines a different layer in the HEC and suggests that there is potential for the platform to move to the next stage in its evolution.

 

Architecture of the HANA Enterprise Cloud

 

Here is a detailed diagram provided by SAP of the architecture of the platform.

 

image001.jpg

 

Let’s simplify it a bit:

 

image002.jpg

 

Before we simplify the architecture even more, it is interesting to note that customers also have the ability to purchase SAP HANA Infrastructure without the other managed service portions. This service offering is currently available via three partners: SAP, AWS and Savvis.

 

If you map this pure-infrastructure offering onto our HEC architectural drawing above, you get this diagram:

image003.jpg

Let’s continue the simplification of the HEC architecture:

image004.jpg

In this blog, I’d like to focus on the interface between the Management / Orchestration layer and the underlying IaaS environment.

 

Lessons learned from Cloud Foundry’s Cloud Provider Interface

 

As I described in my last blog, the PaaS Cloud Foundry uses the Cloud Provider Interface to make it IaaS-agnostic.

A Cloud Provider Interface (CPI) is an API that BOSH (Foundry’s deployment installer and operations manager) uses to interact with an Infrastructure as a Service (IaaS) provider to create and manage stemcells and VMs. … A CPI abstracts an underlying virtualized infrastructure from the rest of BOSH, and is fundamental to Cloud Foundry's model for deploying and running applications across multiple clouds. [SOURCE] (my emphasis / addition)

The Orchestration layer in the HEC has the potential to perform a similar role.

 

This feature is already envisioned by SAP as depicted in this diagram:

image005.jpg

[SOURCE]

 

Note: Although the diagram only describes HANA systems, the Orchestration layer must also have the ability to deal with other non-HANA environments inasmuch as it already manages other virtualized servers that are necessary for application servers, etc.

 

Therefore, the Orchestration layer could theoretically support multiple IaaS’s

image006.jpg

 

Which IaaSs could possibly be supported by the HEC?

 

Obviously, SAP’s own cloud infrastructure is being used as an IaaS – let’s look at some other possibilities.

 

OpenStack

 

A few recent investments in this area are interesting.

 

An investment in Virtustream may be seen as being linked to OpenStack

 

Virtustream fields two architectural cloud platforms — one is based on VMware vCenter and ESX and the other is the KVM-based Elastic Computing Platform (ECP) that came out of Virtustream’s acquisition of Enomaly two years ago. Virtustream is working to migrate the ECP option over to OpenStack in the next few years, a migration that the new money will help power.

 

“Our goal is to offer the same services and service level agreement (SLA) atop VMware or OpenStack architecture — you choose,” said Rodney Rogers, CEO of Virtustream. [SOURCE]

It is also critical to remember that Virtustream was one of the very first partners to be certified on HEC.

 

An investment by SAP Ventures in Mirantis - the largest OpenStack systems integrator - also reveals some intriguing aspects.

Investment proceeds will be used to solidify Mirantis’ leading position in the OpenStack market by enhancing the engineering roadmap for Fuel, its tool for management and deployment of OpenStack clouds, and driving integration with technology and products from Mirantis’ partners.

 

Enterprise software giant SAP is investing heavily in cloud technology. Mirantis has already built an OpenStack cloud for SAP using Fuel and the companies will explore further use of OpenStack by SAP.

The ability to support OpenStack would allow SAP to deal with a criticism from Forrester analyst Stefan Ried that arose soon after the HEC was first announced:

Although it’s good to have advanced management capabilities for large Hana environments, the mainstream large-scale cloud providers are evaluating or already following OpenStack. For example, Oracle acquired cloud management expert Nimbula so that it could build its future public and private cloud offerings on the open-source-based OpenStack. SAP is restricting its ecosystem to the certified hardware vendors and is trying to deliver the full software stack on bare metal on its own this time – this should give some cause for concern. [SOURCE]

AWS

 

As I mentioned above, AWS is also a provider for HANA Infrastructure. This service is available as a virtual appliance rather than a physical box. The architecture in this offer shows that it can also provide more complicated environments (multi-node) as well.

 

This support of AWS in the HEC would lead to this architecture:

image007.jpg

The Cloud Frame / Orchestration layer

 

Although the Cloud Frame / Orchestration layer plays such a central role in the HEC, there is very little information regarding what it really contains.

 

A patent from SAP submitted in 2012 provides some more details:

In general terms, a "cloud frame" in accordance with principles of the present disclosure provides a hardware and software product delivery framework to deliver a range of products and services into an enterprise. A cloud frame supports an architecture of hardware nodes, applications (e.g., business software), and services to orchestrate the collaboration of hardware components and applications using a specific collection of hardware nodes to provide resources and services such as computing resources, storage resources, networking resources, and management resources.

 

These cloud frames provide a more uniform building block for orchestration purposes.

 

The limited description of the Orchestration layer currently publicly available reminds me of Mirantis’ Fuel framework (mentioned above):

Fuel is an open source deployment and management tool for OpenStack. Developed as an OpenStack community effort, it provides an intuitive, GUI-driven experience for deployment and management of a variety of OpenStack distributions and plug-ins. [SOURCE]

 

Perhaps, we’ll see some interesting collaboration in this area so that SAP’s orchestration layer is no longer a proprietary environment but rather one that reflects a more open-source approach.

 

Why is this topic important?

 

In my opinion, SAP shouldn’t necessarily be the one supplying the IaaS for the HEC.  The HEC should be IaaS agnostic.

image008.jpg

As long as the IaaS provider meets the necessary SLA-related requirements and provides the necessary functionality, then which provider is used is largely irrelevant.

 

Indeed, the ability to switch between IaaS providers would provide added benefits.

  • Competition between providers regarding price or other feature-sets
  • The ability to choose particular providers who are able to supply special features – higher SLAs, etc.
  • The ability to use multiple IaaS providers reduces dependency on a single provider

 

The main motivation, however, is that such a decision would allow SAP to concentrate on other aspects of the HEC. As the IaaS layer becomes more of a commodity, SAP could focus on layers that are higher up the value chain (for example, AMS-related features – consulting, etc.). SAP already got out of the hosting business once before (in 2009, in an agreement with T-Systems); I’d hate to see it bogged down again in providing such services when it could be exploiting its application-related expertise, where it has real competitive advantages.

Installation in SAP HANA SPS 07 - welcoming hdblcm(gui) to the family


In only a few short support package stack (SPS) releases, the SAP HANA lifecycle management tool set has grown drastically to encompass a plethora of tools and features, including installation, configuration, and update. Prior to the most recent SPS 07 release of SAP HANA, the unified installer was the tool available to streamline server and component installation, all in one fell swoop. For a straightforward installation, the unified installer was a worthy tool; however, when the need to customize individual components arose, it offered less than optimal flexibility.

 

Enter hdblcm(gui). The successor to the unified installer doesn’t have a snazzy name, but what it lacks in pomp and circumstance, it more than makes up for in functionality. The new SPS 07 installation tool is actually a pair of installation tools – hdblcm for CLI and hdblcmgui for GUI – which provides a unified interface to reduce installation complexity, but offers the flexibility to mix and match components and versions. They are positioned as part of the SAP HANA lifecycle management tools family, which includes seasoned favorites such as the SAP HANA lifecycle manager (HLM), hdbinst, and hdbsetup.

 

The introduction of hdblcm(gui) offers several advantages:

 

  • Component-wise installation – Individual components (SAP HANA studio, DB, client, HLM, host agent, AFL, LCApps) can be installed or updated in combination with the server from a single interface. Likewise, all components can be installed at one time.
  • Improved configuration file – A plain text configuration file template with parameters set to their defaults can be generated and edited to be called during installation or update.
  • Improved interactive installation – Both as a graphical interface, by calling hdblcmgui from the command line, or as a command line installation, with iteratively requested parameter entry.
  • Improved batch processing – Installation and update tasks can be called using command line options or the configuration file, without any additional input required.
  • Multi-host system installation from the graphical interface – It is now possible to install a multi-host SAP HANA system from the graphical interface. Multi-host storage and grouping options can also be configured.

  SAP HANA lifecycle management tool hdblcmgui

 

Both the hdblcm and hdblcmgui SAP HANA lifecycle management tools can be used to install an SAP HANA system in one of the installer modes, and with a combination of parameter specification methods.

 

Installer modes

 

Installation can be performed in one of the following modes:

  • Interactive mode (default) - Available for hdblcm or hdblcmgui. The person installing the system must enter parameter specifications iteratively until the installation process finishes.
  • Batch mode - Available for hdblcm. The installation accepts the default values for all unspecified parameters, and runs to completion without any additional input required.

 

Parameter specification methods

 

Installation parameter values can be entered in one or more of the following methods:

  • Interactively (default) - Using either command line interaction (hdblcm) or graphical interaction (hdblcmgui), most parameters are requested interactively.
  • Command line options - Installation parameters are given in their accepted syntax as a space delimited list after the program name (hdblcm or hdblcmgui).
  • Configuration file - The configuration file is a plain text file, of parameters set to their default values, that can be generated, edited, and saved to be called during installation with either the hdblcm or hdblcmgui tool.

 

configfile.png

 

More information

 

SAP HANA server installation

 

SAP HANA configuration

 

SAP HANA update

Beam Me Up Hasso


Can the iconic transporter provide insight into how next-generation enterprise software can transform your business?

 

One of the most recognized phrases to ever hit our collective psyche, "Beam me up" conjures up a vision of technology nirvana.  Alongside such marvels as the instant-healing medical tricorder and warp drive, the transporter represents a fundamentally different approach to interacting with the world around us.  However, in this fantastic future world, imagine for a moment the time before the transporter.

transporter.pngImage source:  Transporter room - Memory Alpha, the Star Trek Wiki


Most of us are not too keen on experiencing heavy turbulence on flights, but can you imagine the turbulent experience of taking a shuttle down through the atmosphere?  And relatively speaking, shuttling down to the surface from a starship was soooooo slow.  Transporters didn't just take the head-jarring atmospheric bumps away, but converted travel from a time-consuming affair into an essentially instantaneous experience.  And this in turn presented an amazing opportunity - how would life change as a result?  Herein lies an interesting corollary to the "real" world.

 

Today I live with my family on a few acres in the countryside.  While we treasure the star-filled night skies and quiet environment, it does take almost 30 minutes just to reach a highway.  As such, we frequently plan ahead even before the smallest of trips in order to efficiently combine multiple stops and errands and thus save time and gas.  I've imagined utilizing technologies such as jet packs or drones to address this challenge.  While I can certainly see advantages in the speed of travel, drawbacks such as inclement weather, hauling it around or crashing into a drone do tend to keep my excitement at bay. 

 

But a transporter - that would change the game.  Yes, it is faster, but it's how that speed would change my day to day life that excites me.  At first, my thoughts centered around how I could simply improve the way I approach tasks today, such as transporting to a store and back to pick up a few goods.  But a transporter can actually change the way I live, not just make it faster.  What if my purchased items were just transported to me instead? 

 

Unfortunately, technology has not quite advanced to the level of creating the still mythical transporter, but in accordance with Moore's Law, it has produced some incredible innovations.  Advances in hardware have resulted in very powerful systems with dramatically lower costs per unit.  As a result, a shift within enterprise architectures from slow, disk-based systems to "in-memory" computing is providing data-access time improvements measured in the thousands to hundreds of thousands, with higher figures around the corner.

 

However, SAP cofounder Hasso Plattner and his team realized that simply caching a chunk of data into memory wasn't the answer.  Such platforms needed intelligent software  to effectively utilize and optimize the potential that in-memory technology provides.  After all, hopping into a transporter without proper software could very well beam you into a wall or outer space.

 

Hasso realized the incredible opportunity at-hand and began to rethink what could be done with such a technology.  He questioned the status quo in many ways, asking "ludicrous" questions such as whether conventional, disparate transactional and analytics systems could be combined.  And through this innovative process, SAP HANA was born, combining database, data processing and application platform capabilities in-memory. 

 

Instant, intelligent access to enterprise data anywhere - SAP HANA is a transporter for enterprise data.


SAP HANA enables you to move from today's slow and bumpy shuttle-like architectures to a platform built on top of this enterprise data transporter, providing you the unique opportunity to rethink your business and industry ahead of your competition.  SAP is already re-imagining business, and one powerful example that realizes this potential is SAP Business Suite powered by HANA (BSoH).  BSoH provides native support for numerous business processes optimized for the HANA in-memory platform. While dramatically accelerating core business processes may seem the most obvious benefit, you will also uncover new growth opportunities and empower your people with greatly simplified business insights that enable real-time decision making.

 

Imagine…

 

…executing a fast financial close  with real-time visibility down to the lowest level of detail for all interested parties.

…running Material Requirements Planning (MRP) in real time, enabling you to react instantly to changes in demand and perform what-if scenarios to optimize decisions regarding reallocation and outsourcing.

…real-time analysis of security roles and their usage to uncover hidden access violations and opportunities to reduce role complexity.

 

When presented with such a radical new technology, it can be difficult to step outside of the daily grind long enough to imagine the possibilities.  But as we've seen so many times before, radical ideas today quickly become the norm and it is essential now more than ever to rethink.  SAP is questioning everything about business and exploring the art of the possible.  Join us on the journey.

 

Ping me if you'd like to ponder how an enterprise data transporter can change your business.

 

Beam me up Hasso.

 

Dwayne

dwayne.desylvia@sap.com

SAP HANA double-0 revision in "A secret present"


Let's do this quick and painless!

 

As of SPS 7 (rev. 70.00), SAP HANA no longer automatically creates concatenated attributes for multi-column joins.

 

BANG!

 

There you go.

I had just asked for this (Looking for more info on CONCAT_ATTRIBUTE, Free Space by Unload Columns, Alter Table for hidden concat_attribute columns...) - shortly after learning that these concat attributes were there in the first place - and now we've got it.

 

Let's see how much better the HANA world will be now.

 

Of course, everything comes at a price.

Multi-column joins that are executed without these concat attributes will take more time than cases where a concat attribute can be used.

To have concat attributes in place when you want them, you can simply create an index on the table covering the exact set of join columns.
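
For example, a sketch with made-up schema, table, and column names - the index just has to cover exactly the columns used in the multi-column join:

CREATE INDEX "IDX_SALES_JOIN" ON "MYSCHEMA"."SALES" ("MANDT", "ORDER_ID");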

 

For multi-column joins defined in any of the information model views, the activation of the view will still trigger the creation of the concat attribute, and that makes sense.

 

All in all the big problem with these concat attributes was that they had been created automatically and didn't require any privilege for that.

Even in production systems one could fire a SQL statement with a join that would in turn lead to the automatic creation of two relatively large data structures.

On top of that, these concat attributes were difficult to find and remove.

With the current setup this cannot happen any longer.

 

Nice.

Once I have more time again, I will provide an update/follow up to Playing with SAP HANA that provides more details and explanations.

 

But that's all for now folks.

Now you know!

 

Have a great holiday season, cheers!

 

Lars

My Learning from TechEd Bangalore - Day 1


Hi Everyone,

 

It was a great week as I got to participate in TechEd. It was very special as I was attending my first TechEd.

So in this blog I would like to share my learning from Day 1.

 

The TechEd started with Keynote from Dr. Vishal Sikka.

20131211_095857.jpg

 

He explained the HANA Effect, stating that HANA harnesses the massive power of modern multicore processors and machines, and that HANA is more than a database - it has become a platform for SAP.

He explained some of the statistics as shown below:

20131211_100643.jpg

He explained that when HANA was launched, it could perform 2 billion integer scans per second per core, but now, because of advances in both HANA technology and processors from Intel, HANA is able to perform 3.5 billion integer scans per second per core. At present HANA uses Ivy Bridge processors from Intel; Haswell processors will be available next April, and HANA will then be able to perform even more integer scans per second per core.

 

He then went on to explain current Landscape that Customers use today as shown below:

20131211_101408.jpg

The current Landscape can be dramatically simplified by using different HANA Applications and HANA Cloud platform as shown below.

20131211_102123.jpg

Then Michael Reh came onto the stage and said that more and more customers are adopting HANA. You can learn more about the HANA momentum from the picture below.

20131211_102334.jpg

He then said that in India there are more than 130 customers running HANA.

He then went on to explain new features in HANA SPS7.

You can learn more about What's New in HANA SPS7 from the below link:

http://help.sap.com/hana/Whats_New_SAP_HANA_Platform_Release_Notes_en.pdf

 

Then Jake Klein came on to stage to share more information on River Language

20131211_102848.jpg

He said that using RDL (River Definition Language), developers can focus on what they want the application to do rather than on how the application runs.

He said that through its pilot projects, SAP has observed that with River, the code needed to run an app can be reduced by 5-10 times.

He also mentioned the dedicated SAP River site: SAP River - you can sign up there and experience coding in River.

 

After this Michael Reh made a great  announcement for OpenUI5.

20131211_103726.jpg

Many developers in the community have been writing about making SAP UI5 open source, and finally SAP has launched OpenUI5.

http://scn.sap.com/community/developer-center/front-end/blog/2013/11/20/reasons-why-sap-should-open-source-sapui5

http://scn.sap.com/community/events/teched/blog/2013/11/15/the-12-days-of-teched--my-wish-list-for-teched-bangalore

Learn What is OpenUI5 here: http://scn.sap.com/community/developer-center/front-end/blog/2013/12/11/what-is-openui5-sapui5

Learn more about OpenUI5 here: http://sap.github.io/openui5/index.html

 

After that Soumya showed a demo on SAP Lumira using IPL Cricket data.

20131211_104555.jpg

She showed that it is very easy to analyse data and share the analysis using cloud through Lumira.

You can learn more about What's New in SAP Lumira SP13 from here:

http://scn.sap.com/community/lumira/blog/2013/10/29/whats-new-in-sap-lumira-sp13

Also watch Lumira Tutorials from here:

http://scn.sap.com/docs/DOC-26507

 

Then Bernd Leukert came on to the stage and said that before TechEd Las Vegas, SAP had 450+ Suite on HANA customers, and that in the last 4-6 weeks 100 more customers have opted for Suite on HANA.

20131211_112142.jpg

At the end of the keynote, SAP Ganges - a business retail network - was announced.

20131211_114900.jpg

Learn more about SAP Ganges here: SAP GANGES CPG

http://scn.sap.com/community/business-trends/blog/2013/12/12/sap-ganges-helps-it-flow-to-retail-customers-that-most-vendors-ignore

 

You can watch the complete Keynote below:

SAP Executive Keynote Bangalore: Dr. Vishal Sikka | SAP TechEd Online

 

After that, I attended a Demo Session on "HANA Data Marts" where we were shown videos on different Data Provisioning techniques.

We were shown how to load data using flat files, Data Services, and SLT. We were also shown how to use Smart Data Access. Most of these things I already knew.

You can check my blog http://scn.sap.com/community/hana-in-memory/blog/2013/08/17/hana-reference-for-developers--links-and-sap-notes to get links to all the Data Provisioning technologies and learn more about them.

 

After that in the evening, I attended a lecture on "Security in Different SAP HANA Scenarios"

 

For all the figures shown below, the credit goes to SAP.

 

The below figure shows HANA Security Architecture

1.jpg

For authentication/SSO, we can use SQL access or HTTP access (HANA XS):

We can log on to HANA through SQL access using username and password, Kerberos, or SAML (Security Assertion Markup Language).

We can also log on to HANA through HTTP access using username and password (basic authentication and form-based authentication), SAML, X.509 certificates, and SAP logon tickets.

We can also define password policies covering password length, password complexity, etc.

 

For Authorization, we have different privileges built into SAP HANA like System Privileges, Analytic Privileges, SQL Privileges, Package Privileges and Application privileges.

 

For logging on to HANA, users and roles must exist in the Identity Store


SAP HANA provides SSL for communication Encryption and Data Encryption for Data Volumes on disk


SAP HANA also provides audit logging - logging of configuration changes; user, role, and privilege changes; and data access.

We can also create our own policies for audit logging, and the audit trail is written to the Linux syslog.


We can have the following three HANA Security Scenarios.


Scenario 1:

2.jpg

The above scenario is a 3-tier scenario like using BW on HANA or Business Suite on HANA.

In this scenario, the security is generally implemented in the application server, and HANA functions are only used to manage administrative access to data.

 

Scenario 2:

3.jpg

The above scenario is the Data Mart scenario, in which we replicate data from the source system to HANA and then use reporting tools like BO to analyse the data.

The security functions are taken care of by HANA - privileges to consume HANA views, privileges for users or roles, and managing administrative access to data.

 

Scenario 3:

4.jpg

The above scenario is a 2-tier architecture making use of the lightweight web application server present in HANA, with reporting done through browser-based applications built using HANA XS.

In this scenario, security is completely handled by HANA: authentication, authorization, encryption, audit logging, and securing the web applications.

 

So we got to know about the above three HANA scenarios and how security is handled in each case.

 

To learn more about HANA Security, read the following guides:

SAP HANA Platform – SAP Help Portal Page

Also read HANA Administration Guide:

http://help.sap.com/hana/SAP_HANA_Administration_Guide_en.pdf

 

Regards,

Vivek

My Learning from TechEd Bangalore - Day 2 - SAP River, Augmented Reality and HANA Developer Tools


Hi Everyone,

 

Earlier I shared my learning from Teched Day 1:

http://scn.sap.com/community/hana-in-memory/blog/2013/12/15/my-learning-from-teched-bangalore--day-1

In this blog, I would like to share my learning from Day 2.

I attended the first session at 08:30 AM on "River Based Development".

 

For all the figures shown below, the credit goes to SAP.

 

First, we were told why River was introduced.

Generally we use a 3-tier architecture in which we build data models from tables and views and use triggers.

Then we think of the business logic - how to manipulate data. After this we need security so that the right people can access the right data.

After that we build UIs or mobile apps so that users can see the data.

This is a very complicated process, as it involves the use of many technologies like SQL, Java, ABAP, OData, HTML5, etc.

 

To simplify this process, SAP River was introduced - it allows you to build everything in one environment, as shown in the picture below.

1.jpg

The SAP River compiler compiles the River program and splits it into different objects present in HANA, like OData services, stored procedures, tables, JavaScript, etc.

As the data model resides entirely in the database, there is no need to copy data.

 

So SAP River is a language to create:

  • data model
  • business logic
  • access control

Using River, developers can focus on what the application does, without having to worry about optimization and configuration of tools.

RDL (River Definition Language) has rich data types and uses E/R modeling to define the data model:

Entities that correspond to database tables.

Associations that describe relations between tables.

After that we were shown demo of "River Airlines Reservation System".

If we create a table without defining any key, the River compiler generates a hidden field as the key.

Once a River program is activated, an OData service is also created for the River entities and actions, and can be seen as shown below:

20131212_094911.jpg

When we click on Data Preview, an editor with support for CRUD (Create, Read, Update, Delete) operations opens, through which we can change records as shown below:

20131212_091039.jpg

We also have a Random Test Generator that can be used to generate random data based on data types and associations (we can also specify our own requirements), as shown below:

20131212_091225.jpg

As River doesn't support regular expressions, we can use JavaScript, SQLScript, etc. for functions that are not supported by River.


We can create business logic inside River using actions like Get Free Seats and use it as shown in the below figure:20131212_093028.jpg

We can see the result of action in the River Airlines Reservation System as shown below:

20131212_093909.jpg

SAP River is a database-independent language, but at present it supports only SAP HANA.

If we try to delete any column from a table in River, the compiler checks whether the table has data or not.

If the table doesn't have any data, the column is deleted and the necessary changes are made in the other entities.

But if the table has data, the compiler won't allow that field to be deleted.

 

I will surely try out SAP River and see how it works.

 

You can learn more about River and watch the River Flight Demo from the River YouTube Channel:

http://www.youtube.com/channel/UCObfivfVKb9vNWuO1sBoiPQ

 

You can sign up at site SAP River for one week to try SAP River.

 

Also read the following documents:

Get the SAP River guides from the link below:

SAP River – SAP Help Portal Page

 

Introducing RDL – The River Definition Language:

http://www.saphana.com/community/blogs/blog/2012/11/15/introducing-rdl-the-river-definition-language

Introducing SAP River!!:

http://scn.sap.com/docs/DOC-47587

A first look at "River":

http://scn.sap.com/community/developer-center/hana/blog/2013/11/22/a-first-look-at-river

SAP River Tutorials:

http://scn.sap.com/docs/DOC-49281

 

After that I attended Technology showcase on "Augmented Reality with SAP HANA"

They had a demo showing how wearable technology, such as Google Glass, can be combined with SAP HANA to improve the productivity of field and warehouse workers.

 

The below video was being shown there:

 

They were letting people use Google Glass, scan bar code and get the details of the product as seen in the below pic:

DSC04326.JPG

I really loved seeing HANA being used with latest tech like Google Glass.

The only downside I see is that Google Glass is very expensive - around 1500 USD.

Learn more about Google Glass:

 

A video on Google Glass explanation by MKBHD:

 

After that I attended lecture on "Big Data, SAP HANA and Apache Hadoop"

They explained about What Big Data is, What Hadoop is and how SAP HANA can be used with Hadoop to solve Big Data Problems.

They also gave a brief introduction about Smart Data Access and how it can be used with Hadoop.

I knew most of the things covered in this lecture.

To learn about Big Data, you can check my blog:

http://scn.sap.com/community/hana-in-memory/blog/2013/10/13/big-data-facts-and-its-importance

To learn about Hadoop, you can check blog:

http://scn.sap.com/community/hana-in-memory/blog/2013/10/14/hadoop

To learn about HANA and Hadoop Integration, you can check blog:

http://scn.sap.com/community/hana-in-memory/blog/2013/10/15/b

 

Finally in the evening, I attended a Mini Code Jam session on "SAP HANA Developer Tools" by great Thomas Jung

DSC04349.JPG

He told us about new features in HANA Studio and HANA Web IDE.

The most important thing was the "removal of REGI".

So now HANA Studio is free from its dependency on the HANA Client and REGI.

Earlier, when we wanted to update HANA Studio, we had to update both HANA Studio and the HANA Client to avoid errors or issues; now there is an "Install Update Studio" option in HANA Studio, and we can update it without any issues.

Earlier, operations like checkout were performed sequentially - one file at a time - but now we can perform these operations in batches, which improves performance.

Earlier, when we activated multiple files at once, we sometimes got errors even though the syntax of our files was correct - dependencies were not analysed properly because the objects were processed one at a time. Now that activation and object processing can be performed in batches, the dependencies can be analysed properly.

 

Now we can also perform Inactive Testing - we can test some of our objects without activating them.

 

Associations have been introduced in CDS to define the relationship between entities.

 

Now we also have a Job Scheduler in SAP HANA to schedule jobs of SQLScript Procedures and Server Side JavaScript.

We have a new object called .xsjob now.

The syntax used is very similar to CRON.
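
For illustration only, a minimal sketch of what such an .xsjob file might look like - the package, file, and function names are invented, and the xscron fields (year, month, day, weekday, hour, minute, second) below mean "every day at 01:00:00":

{
    "description": "Nightly cleanup of staging data",
    "action": "my.package:cleanup.xsjs::doCleanup",
    "schedules": [
        { "description": "run once a day", "xscron": "* * * * 1 0 0" }
    ]
}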

 

Now HANA Web IDE has four different roles that support four different tools as defined below:

sap.hana.xs.ide.roles::CatalogDeveloper

sap.hana.xs.ide.roles::EditorDeveloper

sap.hana.xs.ide.roles::SecurityAdmin

sap.hana.xs.ide.roles::TraceViewer

 

The HANA Web IDE and HANA Studio have been brought close together and have almost the same functionality.

But one area where the Web IDE triumphs over HANA Studio is its support for version management.

 

To learn more about new features in HANA Studio and HANA Web IDE, check the below blogs by Thomas:

SAP HANA SPS07 - Various New Developer Features:

http://scn.sap.com/community/developer-center/hana/blog/2013/12/03/sap-hana-sps07--various-new-developer-features

What´s New? SAP HANA SPS 07 Web-based Development Workbench:

http://scn.sap.com/community/developer-center/hana/blog/2013/12/03/what-s-new-sap-hana-sps-07-web-based-development-workbench

 

Regards,

Vivek

HANA System Replication - Switching back and forth


Many people asked me what the correct order is when using hdbnsutil commands in the context of system replication.

 

If you have no clue what I am talking about, you might want to press ALT+F4 or have a look at the following how-to guide:

How to Perform System Replication for SAP HANA

 

Let me try to make it as clear as crystal...while keeping it simple.

 

ENVIRONMENT

A=first and preferred data center

B=second data center

PRIM=HANA system which acts as primary in system replication mode

SEC=HANA system which acts as secondary in system replication mode

 

INITIAL SETUP TO ENABLE SYSTEM REPLICATION BETWEEN A AND B

A> hdbnsutil -sr_enable --name=SITEA (HANA system started, A=PRIM)

B> hdbnsutil -sr_register --name=SITEB --… (HANA system stopped, B=SEC)

B> HDB start (system replication will start when B is completely started)

 

DISASTER IN A OCCURS

B> hdbnsutil -sr_takeover (B=PRIM)

now recovery of A is performed and when done next step

A> hdbnsutil -sr_register --name=SITEA --… (HANA system stopped, A=SEC)

A> HDB start (system replication will start when A is completely started, HANA will determine if a complete sync is necessary or a delta will do)

 

FAIL BACK

Now A=SEC B=PRIM and systems are in sync. You might want to fail back to the original setup A=PRIM and B=SEC for various reasons

A> hdbnsutil -sr_takeover (HANA system started, A=PRIM)

B> HDB stop

B> hdbnsutil -sr_register --name=SITEB --… (HANA system stopped, B=SEC)

B> HDB start (system replication will start when B is completely started)

yes, this is the only way to fail back, also if you ask me 10 more times ;-)
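
One safety net before issuing any of these commands is to check which role each system currently has; on either site, the following can help (it prints the current system replication mode and site name of the system it runs on):

A> hdbnsutil -sr_state

B> hdbnsutil -sr_state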

 

 

SIDE STORY: DISASTER IN B OCCURS WHILE B IS SEC

Not much to do, once B is recovered and restarted system replication will re-sync

 

 

Happy XMAS and bye for now,

Frank

 

PS: Executing these commands in the wrong order might leave you with a HANA system which does not start anymore. So be careful! And I know what I am talking about - after searching for hours to find the root cause, I figured out it was created by myself.


The Battle of The Database Elephants


database-elephants-banner.jpg

stonebraker_x220

GigaOM Research recently published an interview with database rock star Michael Stonebraker on “the impending battle of the database elephants,” covering his thoughts on the disruption in the database market.

 

This blog includes the excerpts I thought were most interesting:

 

SAP enters the database market

“In the OLTP market, recent advances have completely convinced me that main memory database systems … are going to completely take over

 

“The database market is really alive, vibrant, with lots of new ideas, and I think the legacy vendors face the “innovator’s dilemma” in spades”

 

“SAP is in the database business and SAP customers are Oracle’s biggest customer right now, and among the elephants there’s going to be a duke it out between Oracle and SAP and I’m delighted to look on from the side.”

Legacy databases are obsolete

“I think data warehouses are an SQL market. It’s just there’s the new way to do it and the old way to do it, and the legacy vendors have the old way to do it. In OLTP, I think it’s a SQL market also, and the legacy vendors have the old way and there’s a new, much better way.

 

In round numbers, the database market is a third OLTP, a third data warehouses and a third everything else, and I think “everything else” is primarily a non-SQL market. I think in datawarehouses and in OLTP, it will remain a SQL market, it’s just the implementations have to change from what they are now to better ideas.

 

The codebases that the elephants, the legacy vendors are selling right now are 25 years old. And it’s time for them to be retired and sent to the home for obsolete software!

Modern ideas behind HANA

“My expectation is that SAP will make a compelling case for their SAP customers switching off of Oracle and onto HANA. That case has not been made yet, it’s way too early. The real thing to watch is how SAP customers are going to react to persuasion from SAP to switch database systems.”

 

I’ve looked at the ideas [behind HANA], and I think the ideas are good. They are modern ideas. It’s too soon to tell whether the implementation will hold up to the ideas. My suspicion is that it deserves to be taken seriously, and that it will have a very large elephant pushing it very hard”

Gap between NoSQL and SQL narrowing

“My favorite way to categorize the NoSQL guys is that they started off as “NoSQL,” meaning “SQL is bad.” After a while, that turned into NoSQL  meaning “not only SQL” – SQL was fine, and they wanted to co-exist with SQL systems. My prediction is that NoSQL will come to mean “not yet SQL.”


The two things the NoSQL guys say is number one, “don’t use SQL, instead use low-level record-at-a-time language.” Cassandra and Mongo have both announced what looks like – unless you squint—a high level language that is basically SQL. I think the NoSQL guys will move to putting higher-level languages on their products, and thereby make the difference between NoSQL and SQL get much smaller.

 

I also think that the second thing is that they don’t like ACID.  The biggest proponent of NoSQL non-ACID has been historically a guy named Jeff Dean at Google, who is responsible for most or all of their database offerings. And he and the team recently wrote a system called Spanner. Spanner is a pure ACID system. So Google is moving to ACID and I think the NoSQL market will move away from eventual consistency and toward ACID, and so I think the distinction between the two camps will decrease in the future.

 

“There’s been 40 years of DBMS research, starting way back in the 70s. This was a  huge debate in the 70s in the relational database research world, and if you go back and look at the history in the 70s, all the discussion today of ACID vs non-ACID all got wrangled out back then. The NoSQL engineers didn’t… You know, “If you don’t pay attention to history, you’re going to have to repeat it,” which I think is what’s happening.

Did Oracle take good care of TimesTen?

only-10-percent-useful-work“That’s a technical question… I and some others wrote a paper called “OLTP through the looking-glass, and what we found there.”

 

We took an open source legacy DBMS called Shore that is from the University of Wisconsin. And we said “suppose all the data fits in main memory?” If you have a terabyte of data or less, or maybe even these days two or three or five terabytes, it’s perfectly reasonable to put that in main memory.

 

So we ran the industry standard benchmark, which is TPC-C, on data with a buffer pool big enough to hold all the data. And then we said “where do all the cycles go?”

 

The answers were a little bit shocking: less than 10% goes into useful work – meaning actually solving the SQL command that comes in. The other 90+% went to four different places [….]

 

TimesTen was architected in the 90s […] it’s got three of the four big pieces of overhead, and so its ability to go blindingly fast is really compromised. And so I think the question is not “has any particular system been well taken care of?” or not, it’s more than that any system written more than six or eight years ago didn’t realize where all the overhead is going, and wasn’t architected in a way that goes blindingly fast.”

The full interview

You can listen to the full interview on Google Soundcloud (The Stonebraker portion starts at 18:20)

My Learning from TechEd Bangalore - Day 3 Geolocation Data, RDS, HANA Live, HANA Cloud Platform


Hi Everyone,

 

Earlier I shared my learning from Teched Day 1:

http://scn.sap.com/community/hana-in-memory/blog/2013/12/15/my-learning-from-teched-bangalore--day-1

and Teched Day 2:

http://scn.sap.com/community/hana-in-memory/blog/2013/12/15/my-learning-from-teched-bangalore--day-2

 

In this blog, I would like to share my learning from Day 3.

I attended the first session at 08:15 AM on "Location Intelligence with Geospatial Data Services".

 

For all the figures shown below, the credit goes to SAP.


The session was very informative as I had no prior knowledge of what Geospatial data was.

The session started with What is Geospatial data and why it is important for us?

Geospatial data is data that has explicit geographic positioning information included within it, such as a road network from a GIS (Geographic Information System) - for example, we can identify any position on the earth by latitude and longitude.

Why is this data important for us?

Although we are living in the era of e-commerce and online transactions, business activities like sales, delivery, maintenance, and manufacturing still happen at some location.

There are many ways in which location data can be used, such as:

1.) By using the nearest location, we can find the closest points of interest such as hotels, restaurants, etc. - like Google Now showing us nearby places of interest whenever we travel.

2.) Identify places for billboard advertisements, such as the billboards we see near shopping malls and movie theaters.

3.) Use location data for site selection, like opening a new retail store or placing a new telecommunication tower (which can also minimize dropped calls).

4.) Transportation can benefit from location data in a number of ways:

a.) Increased delivery speed - by reducing the time spent searching for the right delivery address and taking the shortest route. For example, when Flipkart makes a delivery it can go directly to the address instead of asking for directions at nearby shops or in the neighborhood, reducing search time and fuel costs.

b.) Suppose a train carrying goods is delayed due to an accident or natural disaster; using the current location of the train, we can inform distributors or customers that there will be a delay of exactly 'x' time.

 

So, using location-based intelligence, we can improve revenue growth and operational efficiency and make better decisions.

 

After that, we were told the difference between "Address" and "Location"

What is an address? Well, an address is generally a text string formatted according to a definition provided by a postal authority and intended to direct mail delivery - we have PIN codes, house numbers, and street numbers, but in some places, like villages, we only have PIN codes and no street or house number - so such addresses used for mail delivery do not truly correspond to a location.

What is a location? Well, a location is more precise and describes a specific point. We can use longitude and latitude to specify a precise location on the earth.

 

So now the question arises: how do we get longitude and latitude data for a location? Well, for this there is a process called geocoding.

Geocoding is the process of converting geographical address data, like street number and PIN code, into geographic coordinates (a unique latitude and longitude) that can be shown exactly on a map. These coordinates can then also be embedded into digital media like photographs via geotagging.

It turns an address (non-spatial data) into spatial data by appending the longitude and latitude of the address's physical location.

2.jpg

There are several methods through which GeoCodes can be derived like:

1.) Centroid geocoding - less precise; it provides the coordinates of the centroid (the center of the area of interest) of a locality based on the postal code.

3.jpg

2.) Parcel geocoding - the most precise and most expensive; an address is geocoded to a parcel rooftop, the center point of a specific plot of land called a parcel. This provides accurate latitude/longitude values for our point of interest, for example for a specific house number.

5.jpg

3.) Address interpolation - more precise than centroid geocoding; it uses street data from locations where the street network is already mapped to geographic coordinates. It takes an address, matches it to a street and a specific segment, and then interpolates the position of the address within the range along that segment.

4.jpg


To learn more about Geocoding techniques and comparison, visit the following link:

A comparison of address point, parcel and street geocoding techniques | Paul Zandbergen - Academia.edu

Geocoding 101 | Teradata Developer Exchange

 

After that we were given an overview of Geographical Data Services as shown below:

6.jpg

Check the below figure on how Geographic Data Services work:

7.jpg

For the above geocoding, Data Services uses the SAP Address Directories, which are available on the Service Marketplace - you need a subscription for them.

SAP partners with NAVTEQ, which provides GIS data. NAVTEQ was acquired by Nokia in 2007, and in 2012 Nokia renamed its location offering to HERE Maps.

 

As we all know, Spatial Data Processing was announced with HANA SPS6.

To know more about Spatial Data Processing in HANA, check the links below:

http://www.saphana.com/community/blogs/blog/2013/09/19/spatial-processing-with-sap-hana

http://www.saphana.com/community/about-hana/advanced-analytics/spatial-processing

SAP - SAP Delivers Real-Time Spatial Data Analysis With SAP HANA®

 

Overview of Spatial Processing with HANA:

 

After that I attended a lecture on "Rapid Deployment Solutions for SAP HANA"

 

The session started with an overview of RDS.

What is RDS?

As said in the blog: Jeff Winter | Bluefin Solutions

RDS are modular packages that include SAP software, pre-configured content, best practices, and fixed scope implementation service, resulting in fast and predictable software deployments.

1.jpg

RDS can reduce the time and effort  needed to implement a SAP solution.

The below figure shows Available and Planned Rapid Deployment Solutions for SAP HANA

11.jpg

After that they gave a brief overview of HANA Live

HANA Live provides virtual data models (database views) in SAP HANA for easy analysis and consumption of Business Suite data.

Earlier it was called SHAF (SAP HANA Analytics Foundation).

It provides calculation views built over many SAP tables, which are aware of business logic and customizing dependencies.

It provides 2200+ HANA views.

13.jpg

As shown in the above figure, it has Private Views, Reuse Views and Query Views.

Private Views:  encapsulate certain SQL transformations on one or several database tables or even other views. They are not classified as reuse views as they might not carry clear business semantics but are rather intended to be reused in other views. A private view may be based on database tables, other private views or on reuse views.

Reuse Views: are the heart of the Virtual Data Models. They expose the business data in a well-structured, consistent, comprehensible way covering all relevant business data in SAP Business Suite systems. They are designed for reuse by other views and must not be consumed directly by analytic tools.

Query Views: are the top views in the hierarchy of views. These are designed for direct consumption by an analytic application (for example, based on HTML5) or a generic analytic tool (for example, SAP Lumira). The name of a query view ends with Query. For example: CostCenterPlanActualCostQuery
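
As an illustration, activated query views can be consumed with plain SQL through the _SYS_BIC schema - the package path below is only an assumption, and HANA Live query views typically also expect input parameters such as the SAP client to be supplied via the PLACEHOLDER syntax:

SELECT * FROM "_SYS_BIC"."sap.hba.ecc/CostCenterPlanActualCostQuery";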


Following figure shows HANA Live Browser Application:

20131213_102023.jpg

We can search views using browser and open content

20131213_102138.jpg

20131213_102207.jpg

We can open Query Views for reporting purposes on SAP Lumira as shown below:

20131213_102437.jpg

We can deploy HANA Live side-by-side or as an integrated stack, as shown in the figure below:

12.jpg

It was a good session as I got to know more about HANA Live.

 

Check the video of SHAF for Business Suite:

 

Learn more about RDS:

What is the SAP Rapid Deployment Solution Implementation Methodology? Does It Work?:

http://scn.sap.com/community/rapid-deployment/blog/2012/08/06/what-is-the-sap-rapid-deployment-solution-implementation-methodology-does-it-work

What are RDS’s – SAP Rapid-deployment solutions?:

http://scn.sap.com/community/rapid-deployment/blog/2013/02/27/what-are-rds-s-sap-rapid-deployment-solutions

SAP HANA Rapid Deployment Solutions: Delivering quicker time to value:

http://www.saphana.com/community/blogs/blog/2012/03/08/sap-hana-rapid-deployment-solutions-delivering-quicker-time-to-value

Webcasts: SAP Rapid Deployment Solutions

http://scn.sap.com/docs/DOC-44505

 

Learn more about HANA Live:

http://www.saphana.com/community/blogs/blog/2013/08/22/sap-hana-live-is-generally-available

http://www.saphana.com/docs/DOC-2923

http://help.sap.com/hba

http://www.saphana.com/docs/DOC-2949

 

After that I attended a Demo Showcase on "Full Text Search, Fuzzy Search, Text Analysis and InfoAccess in SAP HANA"

DSC04411.JPG

In the Demo, we were first explained what Full Text Search, Fuzzy Search, Text Analysis is?

Full text search can be used to exploit unstructured data - to use it, we have to enable a full text index.

When a full text index is enabled, two steps take place:

1.) File Filtering: Binary file types like .ppt, .pdf are converted into plain text

2.) Linguistic Analysis:

a.) Tokenization - decomposes word sequence e.g. - "quick brown fox" -> "quick" "brown" "fox"

b.) Stemming - reduces tokens to linguistic base e.g. - Ran -> Run

c.) Part of Speech Identification - eg. - quick - Adjective

The full text index that we create is attached to the table column.
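
As a rough sketch, using the same hypothetical DOCUMENTS table as the queries below, such an index can be created like this (fuzzy search support can be switched on at creation time as well):

CREATE FULLTEXT INDEX "IDX_DOC_CONTENT" ON DOCUMENTS (doc_content) FUZZY SEARCH INDEX ON;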

We can call Full Text Search by using CONTAINS() function in WHERE clause of a SELECT Statement such as:

 

SELECT * FROM DOCUMENTS
WHERE
CONTAINS(doc_content, 'ox')

Text analysis is an optional process on top of full text search; it provides linguistic markup, entity extraction, and identification of domain facts, and it supports 31 languages.

The results of Text Analysis are stored in a table.

 

Fuzzy search is a fast, fault-tolerant search that finds strings matching a pattern approximately (rather than exactly), meaning it returns records even if the search term contains additional or missing characters or other types of spelling error.

The fuzzy search algorithm calculates a fuzzy score for each string comparison.

A fuzzy search score of 1.0 means the strings are identical whereas a score of 0.0 means the strings have nothing in common.

We can call Fuzzy Search by using CONTAINS() function with FUZZY() option in WHERE clause of a SELECT Statement such as:

 

SELECT * FROM DOCUMENTS
WHERE
CONTAINS  (doc_content, 'ox', FUZZY(0.7))

The fuzziness threshold can be set manually when calling a fuzzy search using the FUZZY() option, as shown above.

The default fuzzy threshold is 0.8.
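
A small sketch, again against the hypothetical DOCUMENTS table, showing how the score can be returned and used for ranking:

SELECT SCORE() AS match_score, *
FROM DOCUMENTS
WHERE CONTAINS(doc_content, 'ox', FUZZY(0.7))
ORDER BY match_score DESC;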

 

HANA Info Access can be used to build UIs that provide read-access, standard search and simple analytics.

Only standard attribute views are supported by the web-based HTML UI built for a web browser using HANA Info Access.

 

After this, they showed a demo of the Info Access mobile app on the iPad.

The iPad app supports analytic views.

They showed us that, using the app, we can see how sales are happening in real time, among other use cases.

We can download this app for free from Apple Store:

SAP HANA Info Access on the App Store on iTunes

Check the following HANA Info Access Setup Guide for Ipad:

http://help.sap.com/hana/SAP_HANA_Info_Access_iPad_App_Setup_Guide_en.pdf

At present there is no Info Access app for Android, as only iOS is supported.

To learn more about HANA Info Access, go to page 463 of HANA Developer Guide:

http://help.sap.com/hana/SAP_HANA_Developer_Guide_en.pdf

 

This session helped me to get a good overview of Search functionality in HANA.

 

Check the below links to know more:

 

Video on Fuzzy Search:

The not so fuzzy “Fuzzy Search”:

http://scn.sap.com/community/developer-center/hana/blog/2012/10/10/the-not-so-fuzzy-fuzzy-search

 

SAP HANA Text Analysis: Extracting insights from the written word:

http://scn.sap.com/community/developer-center/hana/blog/2013/05/14/sap-hana-text-analysis-extracting-insights-from-the-written-word

SAP HANA Text Analysis:

http://scn.sap.com/community/developer-center/hana/blog/2013/01/03/sap-hana-text-analysis

Text Analysis with SAP HANA and SAP Business Suite:

http://www.saphana.com/docs/DOC-3996

What´s New? SAP HANA SPS 07 Fulltext Search:

http://www.saphana.com/docs/DOC-4303

Check the Webinar on Text Analysis:

http://www.saphana.com/docs/DOC-4282

 

And finally I attended a Mini Code Jam on "My first App on SAP HANA Cloud Platform"

 

The session was taken by Rui Nogueira.

DSC04412.JPG

 

A brief introduction was given about HANA Cloud platform.

Let me give you all a brief overview of what HANA Cloud Platform is

HANA Cloud Platform is SAP's Platform-as-a-Service offering that enables us developers to build, extend, and run applications in the cloud.

14.jpg

Software as a Service is the application that we want to run.

Infrastructure as a Service provides the technical infrastructure to run the application.

Apps can't run directly on the technical infrastructure but need some enablement, which is provided by Platform as a Service.

So Platform as a Service:

Enables applications to run in the cloud and leverages Infrastructure as a Service for this.

15.jpg

SAP HANA Cloud Platform is a Platform as a Service:

It offers an "Application Platform" that can run our on-demand apps in various programming models, most prominently HANA XS or Java.

It offers "Database Platform" that handles our data in-memory, row and column store and handles data related operations: Geospatial, Analytics, etc.

It also offers "Reuse Services" that are shared across all runtime models.

 

And then we created our account on HANA Cloud:

SAP HANA Cloud Platform

After that we created (actually, reused an already created) simple Hello World application, imported it, published it in the cloud, and then added simple authentication to it.

20131213_163125.jpg

The session was awesome and had some funny moments too.

 

Learn more about HANA Cloud Platform:

 

To learn more on HANA Cloud, check the following blogs:

 

Evolution of the SAP HANA Cloud Platform:

http://www.saphana.com/community/blogs/blog/2013/05/10/evolution-of-the-sap-hana-cloud-platform

Introducing SAP HANA Cloud Platform:

Platform as a Service | SAP HANA Cloud Platform | Cloud Computing Solutions | SAP

SAP HANA Cloud Platform - Content Overview:

http://scn.sap.com/docs/DOC-33139

SAP HANA Cloud Application Development Scenario End-to-End Tutorial:

http://scn.sap.com/community/developer-center/cloud-platform/blog/2012/11/20/sap-nw-cloud-application-development-scenario-end-to-end-tutorial

 

Also learn more about HANA Cloud Platform from OpenSAP:

https://open.sap.com/course/hanacloud1

Check the openSAP course guide - Introduction to SAP HANA Cloud Platform

http://scn.sap.com/docs/DOC-47509

 

8 Easy Steps to Develop an XS application on the SAP HANA Cloud Platform:

http://scn.sap.com/community/developer-center/cloud-platform/blog/2013/10/17/8-easy-steps-to-develop-an-xs-application-on-the-sap-hana-cloud-platform

Using HANA Modeler in the SAP HANA Cloud:

http://scn.sap.com/community/developer-center/cloud-platform/blog/2013/07/16/using-hana-modeler-in-sap-hana-cloud

 

Overall, it was a great TechEd experience.

 

Regards,

Vivek

Create multiple source systems in one Schema in SLT


By default, SLT lets you create a new schema for every source system connection. But if you are wondering how to connect multiple ECC source systems to one schema in SLT, you're stuck, because there isn't much documentation on that. So how do you do it?

 

Multi System Support

SAP lets you create a schema for SLT with several different connection setups:

Multiple Systems Support - SLT.png

 

If you want to report over multiple source systems in one report, you can solve that by using views across different schemas, but that requires extra attention when you create the report. This is why it is preferable to have one single schema containing one uniform table, which gets its data replicated from multiple source systems. Each table usually has a LOGSYS field to filter records if necessary (see the example below).
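
For example - a sketch only, where the schema name and logical system value are placeholders, and which assumes the replicated table indeed carries the LOGSYS field:

SELECT * FROM "SLT_SCHEMA"."MARA" WHERE LOGSYS = 'ECCCLNT100';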

 

Add a System to a current Schema

If you want to add a source system to a schema, first make sure that the system is not yet used in another schema. This sometimes requires you to delete a schema, if your SLT is set up like the top situation in the first picture.

 

To add the free source system to an existing schema, go to your SLT system and log on. Then go to transaction LTR, where you can see and maintain your schemas. Click on "New" to start the wizard for creating a new schema, but instead of typing a new schema name in the "Configuration Name" field, use the current schema name. In the "Description" field you can type any name you like, but this may cause confusion because it will look like there are two different schemas. My advice therefore is to use the same description as the current schema.

 

Key in the source system credentials and, in the next screen of the wizard, the target system credentials. When this is complete, the system correctly identifies what you effectively want to do, namely to add a source system. In the pop-up, select "Yes" to add it to the current schema. Enter the remaining wizard settings to complete the configuration.

Use Existing Schema in SLT.PNG

This then leads to one schema with multiple source systems within SAP HANA studio. In this table, data is replicated from both systems.

SLT - Select source system.PNG

 

I hope this will be useful for SCN members and adds to the existing SAP documentation.

 

Kind regards,

 

Bastiaan Lascaris

SAP BI consultant for CGI.

New features in HANA SP07 - what's useful in the real world?


I've been using HANA SP07 for two weeks now, and delivered a training course on it in the interim. It's fair to say that we kicked it around, load tested it, and generally beat it to death. I thought I'd share the things that HANA SP07 brings that matter most in the real world.

 

1) Data Engine Improvements

 

The data engine improvements are quite dramatic. There was a time when the choice between SQL, Analytic Views, Calculation Views, which engine you ran stuff in, etc. really mattered. Much trial and error was required to build good models. Now, in SP07, we find that there is a simpler set of rules that you can apply for best-practice model generation. Plus, if you get it wrong, you don't get punished with a 100x performance impact.

 

Plus... some things which ran really badly before, like joining row and column store objects, now perform surprisingly well. There has been a lot of quiet work done in the background, which you can see using the Plan Visualizer. This makes a huge difference in the real world!

 

Also it may just be me, but text search seems much faster.

 

2) Developer Experience

 

HANA SP06 was the first revision built for Developers, by Developers, and SP07 consolidates on that effort. The main things that matter...

 

- The tooling is much faster and more consistent in naming conventions

- There are lots of useful things like code completion and syntax correction, which makes development faster and less error-prone

- Improvements to UI Integration Services which mean building PoCs and Mock-ups can be done in hours

 

It's fair to say that there is plenty of work here to come - including even more consistency between development artifacts, but this is a step in the right direction!

 

3) Multi-User Development and Transport Management

 

This was very basic in HANA SP06 and you used to have to transport all of a delivery unit in one go. Now, you can easily have multiple users, developing on the same code. Highlights include:

 

- Inactive code testing

- Change management

- Repository management including version management

- Job Scheduling (which looks like a cron-script generator)

 

4) Smart Data Access

 

Smart Data Access is much improved in SP07 with support for Oracle, MSSQL, Sybase ASE and IQ, Teradata, Hadoop and generic ODBC connections. I've tested it a few times and it looks pretty handy. Plus, it supports Insert/Delete/Update so you could write jobs in HANA XS which move data from your hot store (HANA) to your cold store (IQ) overnight. This is the beginning of automatic data temperature management.
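
As a sketch of that idea - the remote source, schema, and table names are invented, and the exact remote object path depends on the source database - moving year-old rows out to IQ could look roughly like this:

CREATE VIRTUAL TABLE "MYSCHEMA"."VT_SALES_ARCHIVE" AT "IQ_COLD_STORE"."<NULL>"."ARCHIVE"."SALES";

INSERT INTO "MYSCHEMA"."VT_SALES_ARCHIVE"
SELECT * FROM "MYSCHEMA"."SALES" WHERE order_date < ADD_DAYS(CURRENT_DATE, -365);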

 

5) Monitoring

 

I was surprised here, but there are a bunch of things which make monitoring better in the real world. Such small pieces of usability are much appreciated!

 

- Improvements to the Data Preview button. Much faster and better SQL is generated.

- Expensive Statements trace is much faster for some reason

- New monitoring views in HANA Studio (right click the system to see them)

- Ability to see failed SQL Plan Visualizations

 

6) Modeling

 

There's not so much in the modeler, and they haven't done COUNT DISTINCT, AVG or WAVG yet (hint hint!) but there are a few neat new things.

 

- Star Join capability in the output node of the Calculation View modeler. This makes certain types of view much easier to build, as you could previously only join two tables at a time.

- Much improved usability with propagation of objects through models and code completion for expressions

- Much improved performance - huge improvement here

- Copy/Paste!!! Unfortunately not inside the Calculation View modeler :-(

 

7) Maintenance Revisions

 

These allow you to maintain the latest revision of HANA in your project track, and the latest revision of the last SP of HANA in your production track, whilst retaining security fixes. They are very useful in practical deployments of HANA - read more here.

 

There are also a few things which still don't seem to be complete enough to be highly usable:

 

Core Data Services

 

Core Data Services is a mechanism that allows building a whole data dictionary for an application in one development artifact. You can define types, tables, associations and it will build the model for you. But, it is still too limited to use substantially in the real world and doesn't support HANA Information Views.

 

Web IDE

 

There is a new Web IDE in SP06, enhanced in SP07. It feels like a collection of disjointed tools, which it is. It is very useful for Transport Management and a few other things like reliably deleting repository objects, but there's no way it could be used as a Cloud development environment in its current state.

 

Spatial

 

It feels like the Spatial Engine needs some work and there are no concrete examples to test and work with, and my testing couldn't get models working. Probably this is my lack of knowledge.

 

AFL Modeler

 

This still seems to generate script that generates tables, which means the AFL Modeler isn't really usable in the real world. It would be cool if SAP used its KXEN people to help make the AFL Modeler a success.

 

 

Conclusions

 

I find HANA SP07 a very pleasing incremental release. I think it will go down as the release that made HANA ready for the developer community to create large-scale projects in anger, and also as the release where SAP stopped cramming in so many new features and made what was there better, more mature, and focused on developer productivity and usability.

 

The development team should be proud of what they have created. There are a few parts of SP07 which feel a bit rough around the edges, but I'm sure they will be smoothed out in the next few revisions.

 

It also very clearly lays down the foundation for what needs to come in SP08. But more of that in another blog.

SAP HANA SP07 Installation


Hello Guys,

In a few steps I want to share with you my second simple HANA installation. Previously I installed HANA SP05, and just yesterday I installed HANA SP07 - a fresh installation on SUSE Linux 11.2.

 

1. Start the installation; it will check your hardware. If there is a hardware incompatibility it will throw an error. You can still skip the hardware check process.

1.JPG

2. In the second step, you need to define the installation path, the HANA system ID, and the instance number.

2.JPG

3. Here you need to define the instance administrator password.

3.JPG

4. Location of Data and Log files

4.JPG

5. Password for the SYSTEM database user

5.JPG


6. Review and confirm your installation; you will get a summary of your input.

6.JPG

7. Installation Software Progress

7.JPG

8. Creating the HANA System

8.JPG

9. You have successfully installed SAP HANA

9.JPG


I hope this was helpful to anyone who is new to SAP HANA and wants to do the installation by himself.


Regards

Amr Salem
