Channel: SAP HANA and In-Memory Computing

What’s in for You? – Get an Overview on SAP HANA Activities during SAP TechEd && d-code 2014


SAP TechEd && d-code 2014 Las Vegas is only a couple of days ahead of us. The Berlin event follows in mid-November.

Be excited to meet the SAP HANA experts from SAP HANA Development and Product Management!

 

Get comprehensive insights into sessions around SAP HANA and do SAP TechEd && d-code your way!

 

Take part in expert-led exercises and classroom trainings; deep dive into SAP HANA platform topics during demo-rich lectures, review future product directions and chat with the SAP HANA experts in person!

 

Activities covering SAP HANA platform topics are spread across different tracks.

Find the most interesting sessions for your needs by reading the SAP HANA activities blog posts, clustered by track below:

 

 

And don’t forget to check out the 30-minute Expert Networking Sessions to talk with SAP HANA conference speakers in a smaller setting.

 

Looking forward to meeting you,

 

Kathrin


Copying users in SAP HANA


Hi all,

 

as most readers will be aware, database users in SAP HANA are non-copiable entities. The most important reason for the lack of a generic user copy mechanism is that, in the course of a user copy, all privilege and role assignments would have to be copied as well. While this is not a real challenge for role grants, the privileges granted directly to end users are what makes a generic user copy mechanism impossible.

 

As the simplest example, consider that you have two users, Jim and Bob. Jim owns a database schema named "JIM", and for whatever reason has given Bob the permission to work in that schema, that is, Bob has object privileges such as SELECT on schema JIM.

 

Now assume another database user, let's call her SYSTEMA, wants to copy user Bob. In that process, she would also need to assign the SELECT privilege on schema JIM to the new user - but does SYSTEMA have the required privileges? She probably won't, and thus the process of copying the user would fail.
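
To make this concrete, here is a minimal sketch of the grant situation just described (the user and schema names are purely illustrative):

-- Jim gives Bob access to his own schema:
GRANT SELECT ON SCHEMA "JIM" TO BOB;
-- A generic copy of Bob would have to re-issue this grant for the new user,
-- which only Jim (or a user authorized to grant on schema JIM) can do.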

 

In the SAP HANA studio, there has been, since about SPS 7, an option to copy database users - with one restriction: the only user properties that will be copied are the assignments of repository roles (aka design-time roles). If you are operating a well-managed HANA system, this will be fine, because SAP's recommendation is to only grant repository roles to end users, not to make use of catalog roles, and never to grant privileges directly to end users.

 

The copy process is purely a UI functionality, and thus cannot be automated.

 

If you need to automate the copying of users, you might find the procedure below helpful. It allows copying a user to a single target user. The copy will fail under the following circumstances:

  • user to be copied does not exist
  • user to be created already exists
  • user to be copied carries catalog roles (except for PUBLIC)
  • user to be copied has direct privilege assignments (except for privileges on the user's own schema or on other schemas owned by the user)

 

The procedure is prepared as a repository object - you can simply paste the source code into the procedure editor in the SAP HANA studio or the Web IDE. Then, in the procedure header, you'll have to replace the <schema> (that's the database schema into which the activated version of the procedure will be placed) and the <package1>.<package2> repository path for your procedure. And you will probably want to change the dummy password that we give to the user.

 

Have fun,

Richard

 

 

PROCEDURE "<SCHEMA>"."<package1>.<package2>::copy_user"  ( IN source_user VARCHAR(256), new_user VARCHAR(256) ) 
 LANGUAGE SQLSCRIPT
 SQL SECURITY DEFINER
 AS   -- SQL statement we're going to execute when creating    v_statement VARCHAR(1024);   -- variable used for validation queries   found INT := 0;
BEGIN    -- get all repo roles granted to the user:    declare cursor c_cursor FOR select role_name, grantee, grantor from granted_roles where grantee=:source_user and role_name like '%::%';    -- prepare error handling in case of invalid arguments    DECLARE USERNOTEXIST CONDITION FOR SQL_ERROR_CODE 11001;    DECLARE USERALREADYEXIST CONDITION FOR SQL_ERROR_CODE 11002;    DECLARE WRONGROLETYPE CONDITION FOR SQL_ERROR_CODE 11003;    DECLARE PRIVSGRANTED CONDITION FOR SQL_ERROR_CODE 11004;    DECLARE EXIT HANDLER FOR USERNOTEXIST RESIGNAL;        DECLARE EXIT HANDLER FOR USERALREADYEXIST RESIGNAL;    DECLARE EXIT HANDLER FOR WRONGROLETYPE RESIGNAL;    DECLARE EXIT HANDLER FOR PRIVSGRANTED RESIGNAL;               -- check input parameter source_user:     -- does the user exist?    SELECT COUNT(*) INTO found FROM "USERS"      WHERE "USER_NAME" = :source_user;     IF :found = 0 THEN               SIGNAL USERNOTEXIST SET MESSAGE_TEXT =           'Source user does not exist: ' || :source_user;    END IF;        -- check input parameter new_user:     -- does the user exist?    SELECT COUNT(*) INTO found FROM "USERS"      WHERE "USER_NAME" = :new_user;     IF :found > 0 THEN               SIGNAL USERALREADYEXIST SET MESSAGE_TEXT =           'New user already exists: ' || :new_user;    END IF;        -- check roles granted to source user. We can only copy repository roles (containing ::)    -- the only allowed catalog role is PUBLIC    SELECT COUNT(*) INTO found FROM GRANTED_ROLES        where GRANTEE=:source_user and ROLE_NAME != 'PUBLIC' and ROLE_NAME NOT LIKE '%::%';    IF :found > 0 THEN               SIGNAL WRONGROLETYPE SET MESSAGE_TEXT =           'There are catalog roles (other than PUBLIC) granted to the source user ' || :source_user;    END IF;              -- check that there are no privileges (not roles) granted directly to the user - except for    -- privileges granted directly by SYS -> these would be privileges on the user's    -- own schema, or on schemas that the user has created.     -- If the user has any privileges directly granted (grantor != SYS), we will not copy    SELECT COUNT(*) INTO found FROM GRANTED_PRIVILEGES        where GRANTEE=:source_user and GRANTOR != 'SYS';    IF :found > 0 THEN               SIGNAL PRIVSGRANTED SET MESSAGE_TEXT =           'There are privileges granted directly to the source user ' || :source_user;    END IF;          -- create the new user with dummy password BadPassword1    v_statement := 'create user ' || :new_user || ' password BadPassword1';    exec v_statement;        open c_cursor;    -- and grant all the roles.    for ls_row as c_cursor DO      -- assemble grant statement for role in current loop:      v_statement := 'call grant_activated_role ( '''|| ls_row.ROLE_NAME ||''', '''|| :new_user ||''')';      -- and grant role:      exec v_statement;    END FOR;       
END;
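
A quick usage sketch, assuming the activated procedure ended up under the same placeholder schema and package names used in the header (replace them with your own):

CALL "<SCHEMA>"."<package1>.<package2>::copy_user"('BOB', 'BOB_COPY');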

SAP Certified Support Associate: SAP HANA - by the SAP HANA Academy


Introduction

Interested in getting certified as an official SAP HANA support associate? In this blog, I will explain how you can prepare for the SAP Certified Support Associate - SAP HANA certification exam.

 

This blog is part of a series.

 

About the Certification

The Support Associate certification was introduced with SPS 06 (June 2013). Tim Breitwieser blogged about it at the time on SCN: SAP HANA Certification Program extended with "SAP Certified Support Consultant". In this blog he explained that the exam is "a pre-requisite for VARs to become authorized to deliver support for SAP HANA". For those who have access to SAP PartnerEdge, you can find all the information in the Overview Presentation: VAR-Delivered Support for SAP HANA.

 

Topic Areas

There are 80 questions divided over ten topic areas. The cut score is 60%, which means that you need to answer at least 48 questions correctly. Below are the different topics and their relative weights. See the C_HANASUP_1 page on the SAP Training and Certification Shop for the specifics.

 

When I took the exam, I got 13 questions each on the >12% topics (Performance and Backup & Recovery), 8 questions on each of the 8%-12% topics, and only 4 questions on the two <8% topics. Your mileage may vary, but when you prepare for the exam it is best to focus on the two big topics first, as they account for roughly a third of all questions.

[Screenshot: topic areas and their relative weighting]

Resources

As with the certification E_HANATECxx, the main resource is the SAP Education training HA200 - SAP HANA - Installation & Operations. This 5-day training covers all the topics mentioned above.

 

The other main source of information for the exam can be downloaded free of charge from the publicly available SAP Help Portal (help.sap.com):

 

Performance:

 

On the PartnerEdge portal, the curriculum is a bit more extensive. It includes both HA100E and HA200R and also the two support processes e-learnings (12 minutes and 30 minutes), plus some additional guides from the Help Portal.


From my experience, most of the questions came from topics covered in the Administration Guide (or the HA200 training). The guide is a hefty 600+ pages, so this is where I would focus.

 

Additionally, I can strongly recommend reading the SAP HANA Performance Analysis Guide. This guide is not listed in the resources as it was introduced with SPS 07, whereas the exam is on SPS 06, but it provides good insights on how to tackle performance issues (13 questions).

 

Note that the SAP Help Portal on http://help.sap.com/hana_platform only shows the latest documentation, SPS 08 at the time of writing. For SPS 06 and SPS 07, you need to go to the SAP Service Marketplace on http://service.sap.com/hana.

 

 

Practise, practise, practise

Just training or reading the documentation, however, will not be enough. You will need hands-on experience, in particular with SAP HANA studio. A large number of questions will test your familiarity with the different views, tabs and subtabs of SAP HANA studio. You are expected to know the names and to be able to navigate the Administration Console in the dark. So close your eyes and tell me how to configure a certain parameter to its default setting.

 

Got it? Click the Configuration tab, then double-click the parameter, then ...

 

[Screenshot: delta merge parameters on the Configuration tab]
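
If you prefer the SQL route, a parameter can also be reset to its default by removing the customized value from the configuration layer. A minimal sketch with placeholder file, section and parameter names (the exam, however, expects you to know the studio path):

ALTER SYSTEM ALTER CONFIGURATION ('indexserver.ini', 'SYSTEM')
  UNSET ('<section>', '<parameter>')
  WITH RECONFIGURE;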

 

SAP HANA Academy

To help you prepare, I have created a playlist for the exam:

 

The following playlists may provide additional background:

 

Sample questions

On the certification page, a link to a PDF with sample questions is included. Below I marked the answers in bold and included a reference to the source with some tips and to do's.

 

==

 

1. You are supporting an SAP Netweaver BW Powered by HANA system that is in productive operation. Inadvertently, a table was deleted. The last data backup was three days ago. You need to recover the system to the point before the table deletion. What do you have to do to accomplish this?

 

a. Restore the SAP HANA database from backup and apply the logs to recover to a point in time.

b. Reinstall the SAP HANA database and recover the database from the last known backup.

c. Recover the database from the last known backup and reload the deleted table.

d. Drop the table in the SAP HANA database and import the table from another system.

 

Source: Point in time recovery (PITR) is documented in the Backup and Recovery section of the SAP HANA Administration Guide ( SPS 07: 4.2.13 - Recovering the SAP HANA Database).

 

In theory, you would only need to reinstall the SAP HANA server - answer b - if the disk that contained the software no longer works. However, in real life, the data volume would typically be stored on a RAID volume protected against the failure of a single disk drive.

 

Recovery is a two-step process. First we restore the data backup, then we apply the logs to recover the database to the time of failure or to a desired point in time. Answer c is wrong because once we have recovered the database, we are done. There is no need to reload any data.

 

Answer d is an option but typically not the right one as importing a table from another system may have disastrous consequences.

 

To do: Read the SAP HANA Administration Guide on backup and recovery and try it out yourself (delete a data file, delete a log file, etc.). Backup and recovery is an important topic in the exam so you need to be familiar with it.
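
For the hands-on part, a minimal sketch to give yourself a recoverable starting point (the file prefix is illustrative); the point-in-time recovery itself is started from the studio or recoverSys.py while the database is offline, so it cannot be shown as a plain SQL statement here:

-- create a full data backup you can later recover from
BACKUP DATA USING FILE ('monday_full');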

 

 

==

 

2. Your client tells you that the SAP HANA database backup is not working. In the SAP HANA studio, where could you start to investigate the issue? There are 3 correct answers to this question.

 

a. _SYS_BI schema

b. SYS schema

c. Administration editor - System Information

d. Administration editor - Performance overview

e. Administration editor - Alerts

 

Source: As above, Backup and Recovery section of the SAP HANA Administration Guide.

 

A tricky question, as to be honest I have no idea why you would want to start to investigate this issue with the SYS schema. However, when there are 3 correct answers, that means there are 2 wrong ones: 

  • _SYS_BI is a schema internal to the database; this is an even more unlikely candidate for investigation.
  • Administration editor - Performance does not provide backup information. In fact, there is no Performance overview view or tab to start with.

 

Alerts is always a good place to start investigations, and the System Information overview tab shows the latest alerts.


 

To do: Same as with question 1, read the guide and DIY.

 

==

 

3. In an expensive statement trace configuration, you want to identify queries that run longer than two minutes. Which value do you enter?

 

a. 12 000

b. 120

c. 120 000 000

d. 120 000

 

This kind of question is typical for the exam. Unless you have configured the expensive statements trace a few times, it is not very likely that you would know this.

 

As you can see in the SAP HANA Academy video below, trace thresholds are set in microseconds, that is, millionths of a second: two minutes = 120 seconds = 120,000,000 microseconds.

 

To do: Activate the expensive trace on your test database and investigate the output. This is documented in the SAP HANA Administration Guide here.
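
As a sketch of the SQL equivalent of the studio's trace configuration dialog (the global.ini section and parameter names below are my assumption of the relevant settings; verify them against your revision):

ALTER SYSTEM ALTER CONFIGURATION ('global.ini', 'SYSTEM')
  SET ('expensive_statement', 'enable') = 'true',
      ('expensive_statement', 'threshold_duration') = '120000000' -- microseconds = 2 minutes
  WITH RECONFIGURE;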

 

 

==

 

4. You have selected SAP HANA as product in SAP Solution Manager and checked the prerequisites. What are the next steps to configure SAP HANA as a managed system in SAP Solution Manager? Please choose the correct answer.

 

a.  1. Assign diagnosis agents.

    2. Create logical components.

    3. Enter system parameters

    4. Enter landscape parameters.

    5. Check configuration.

 

b.  1. Check configuration.

    2. Create logical components.

    3. Assign diagnosis agents.

    4. Enter system parameters.

    5. Enter landscape parameters.

 

c.  1. Assign diagnosis agents.

    2. Enter system parameters.

    3. Enter landscape parameters.

    4. Create logical components.

    5. Check configuration.

 

d.  1. Assign diagnosis agents.

    2. Check configuration.

    3. Create logical components.

    4. Enter system parameters.

    5. Enter landscape parameters.

 

 

Source: This is documented in the presentation by Active Global Support - HANA Supportability and Monitoring Setup (SAP Service Marketplace account required).

 

See also SAP Note 1747682 - SolMan 7.1: Managed System Setup for HANA

 


 

 

To do: Maybe I was lucky, but there was only a single question about Solution Manager when I passed the exam.

 

If you want to know exactly how this is set up, see

 

==

 

5. What are the purposes of executing a delta merge operation in an SAP HANA database? There are 3 correct answers to this question.

 

a. To move updated records from delta storage to column store

b. To move new records from SAP ECC tables to delta storage

c. To move merge data from row tables to column store

d. To move inserted records from delta storage to column store

e. To move deleted records from delta storage to column store

 

Source: This is documented in section 2.6 Managing Tables - The Delta Merge Operation of the SAP HANA Administration Guide.

 

You cannot merge data from row tables to column store - these are different engines - and delta merge has nothing to do with SAP ERP.

 

To Do: If you want to get a good understanding of delta merge and the inner workings of SAP HANA, I can strongly recommend the OpenHPI course In-Memory Data Management (2014) - Implications on Enterprise Systems. Bonus: Mr. Hasso Plattner is teaching, with a guest appearance by Mr. Bernd Leukert.

 

==

 

6. An ABAP program in SAP ECC is being optimized for the SAP HANA database. You have been asked to identify expensive SQL statements of this program that run for longer than one second. What do you have to do to identify these expensive SQL statements? There are 2 correct answers to this question.

 

a. Filter expensive SQL statements by DB user

b. Enable expensive SQL statements tracing

c. Set the trace level

d. Set the threshold duration

 

Source: This is documented in section 2.5.8 - Monitoring System Performance and 2.10.3 - Configuring Traces of the SAP HANA Administration Guide.

 

The mention of ABAP and SAP ECC is not relevant here.

 

You set the trace level for the SQL Trace, not for the expensive statement trace.

 

[Screenshots: SQL trace and expensive statements trace configuration]

 

To Do: As mentioned under resources, read the SAP HANA Performance Analysis Guide on this topic.

 

==

 

7. In the SAP HANA studio, which of the following enables you to identify the memory consumption of loaded tables?

 

a. System Information tab of the Administration editor

b. SYS.M_TABLES

c. Load subtab of the Performance tab of the Administration editor

d. SYS.M_CS_TABLES

 

Source: How the M_CS_TABLES view can be queried for this purpose is documented in section 2.6.4 Data Compression in the Column Store of the SAP HANA Administration Guide, albeit on the topic of data compression.

 

SELECT SCHEMA_NAME, TABLE_NAME,
       ROUND(SUM(ESTIMATED_MAX_MEMORY_SIZE_IN_TOTAL)/1024/1024/1024) AS "SIZE IN GB"
  FROM M_CS_TABLES
 WHERE SCHEMA_NAME = <SCHEMA_NAME>
 GROUP BY SCHEMA_NAME, TABLE_NAME
 ORDER BY TABLE_NAME

 

Personally, I would argue that answer a is equally correct because, as you can see on the print screen below, the System Information tab of the Administration editor includes a report that "Shows memory consumption of schemas (loaded tables)".

 

[Screenshot: System Information tab]

 

The Load subtab allows you to display a performance graph for different counters, but it does not help you to identify the memory consumption of loaded tables.

[Screenshot: Load subtab of the Performance tab]

 

To do: Familiarise yourself with section 2.6 on Managing Tables of the SAP HANA Administration Guide and study the views of the System Information tab.

 

==

 

8. Which of the following columns are displayed in the Merge Statistics system report? There are 3 correct answers to this question.

 

a. TYPE

b. PART_ID

c. STATEMENT_STRING

d. MOTIVATION

e. CONNECTION_ID

 

Source: Documented under M_DELTA_MERGE_STATISTICS in the SAP HANA SQL and System Views Reference (SAP Library). Below is a print screen of the report definition.

 

The STATEMENT_STRING and CONNECTION_ID columns are part of the view M_EXPENSIVE_STATEMENTS.

 

This question evaluates experience. Anyone familiar with monitoring merges and analysing expensive statements would know this. You are not expected to learn all the view definitions by heart, but you are expected to be familiar with the most common support activities.

[Screenshot: Merge Statistics report definition]

 

To do: Same as for question 7, above.
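
If you want to poke at this yourself, a small sketch of how recent merges can be inspected (the selected columns are taken from the question and, as an assumption, from the system view documentation linked above):

SELECT START_TIME, SCHEMA_NAME, TABLE_NAME, TYPE, MOTIVATION, SUCCESS
  FROM M_DELTA_MERGE_STATISTICS
 ORDER BY START_TIME DESC;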

 

==

 

9. How can you improve performance of SAP HANA information models?

 

a. Use filters at table level instead of analytic views.

b. Use the JOIN operator instead of the UNION operator in calculation views.

c. Use CE_FUNCTIONS instead of SQL statements in calculation views.

d. Use calculated columns in calculation views instead of calculated measures in attribute views.

 

Source: Topic discussed in the two e-learnings SAP HANA Support Processes: SAP HANA - Execution of Models and SAP HANA Support Processes: SAP HANA Studio - Modeling Components.

 

To do: This is certainly an advanced topic where practice makes perfect. If you get a chance, attend SAP TechEd (d-code); session DMM270 - Advanced Data Modeling in SAP HANA addresses this topic.

 


 

==

 

10. In a scaled-out, high-availability environment for an SAP HANA database, how can you monitor the status of the hosts in the cluster?

There are 3 correct answers to this question.

 

a. 1. In the SAP HANA studio, right-click to add a new system.

    2. Create an entry for each of the hosts in this environment.

    3. Log into each of the hosts.

 

b. 1. In the SAP HANA studio, create an entry for a standby host.

    2. Verify that all SAP HANA database processes are running.

 

c. 1. Log into the Linux operating system of the SAP HANA appliance.

    2. Run command ifconfig.

 

d. 1. In the SAP HANA studio, navigate to the Landscape tab.

    2. Select the Services subtab.

    3. Check the Detail column.

 

e. 1. Log into the Linux operating system of the SAP HANA appliance.

    2. Launch the SAP HDB Admin console.

    3. Navigate to the Management Console tab.

    4. Verify that all SAP HANA database processes are running.

 

Source: This is a bit of a curious question, as the HDB Admin tool is a tool used internally by SAP Support and is not publicly documented. You should not find any reference to this tool on the exam, and if you do, you should report it.

 

Below is a print screen of answer d. Note that the Detail column displays which service is the master. To monitor the status (running, initializing, stopped) you would check the Active column. Whether this answer is correct is debatable.

 

With the ifconfig command in UNIX/Linux you can configure the network adapter. This answer is incorrect.

 

A standby host can be part of a high-availability architecture, but this is not a requirement. Hence, this is also not a good answer.

 

Adding each host of a distributed system (which is how an SAP HANA "cluster" is normally referred to in the documentation) might be correct, but would be unusual.

 

[Screenshot: SAP HANA studio - Landscape tab, Services subtab]

[Screenshot: HDB admin - Management Console]

 

To do: Read the sections on High Availability and Scaling SAP HANA in the SAP HANA Administration Guide.

 

 

More Questions?

Unfortunately, but very reasonably, those that have passed the exam are not allowed to share the questions with anyone else, so I can't share any particular question with you here. However, a close study of the mentioned resources should provide you with enough knowledge to successfully pass the exam.

 

Additional Resources

 

Did you succeed?

 

Feel free to post a comment about how the exam went. If there is any information missing, please let me know.

 

Success!

In Memory Technology is here to Stay: Five things to Consider


Remember back in 2010, when in-memory computing was the "thing"? Fast forward to today, and every traditional database vendor provides some kind of in-memory capability in their disk-based database. It's clear that the in-memory database is out of the hype stage and has become a necessary technology for today's businesses to achieve competitive advantage.

 

Having been in the business for 17 years, with 3 years of focus on in-memory, I see all kinds of questions from customers, prospects and analysts. Based on numerous conversations, here are the top five key things to look for when selecting an in-memory database. These are not in priority order.

 

Look for an in-memory database that can:

 

1) Do both transactions and analytics on a single copy of data: Separate databases for transactional and analytical applications require additional IT systems to move data from the transactional database to the analytical database. This also increases the time it takes to see results from the moment the actual transactions occurred. Unless the database can do both transactions and analytics in one database instance, it is not possible to get real-time insights as transactions happen. A traditional disk-based database with an in-memory option keeps data in row format for transactions and keeps some data in columnar format for analytics. This approach is better than separate databases for transactions and analytics, but it still does not give real-time results and requires more system resources.

 

2) Keep all the data in memory unless otherwise specified: Data must be in memory to get results in predictable time. The time it takes to get results for unplanned queries really depends on the number of tables in memory / on disk that are required to complete the query. Columnar storage, advanced compression techniques, and the ability to process both transactions and analytics from a single copy of data allow in-memory databases to keep more data in memory. In-memory columnar databases do not require indexes and materialized views to deliver results quickly. This saves more memory and helps to keep more data in memory. Unless all data is in memory, it is not possible to drill down into granular details from high-level dashboards to get a complete picture of the business. Having the complete data set in memory also helps to predict the future more accurately and quickly. Real-time insights for unplanned queries and the ability to quickly go into the details are not possible with a database that keeps data on disk by default and keeps data in memory only if instructed to.

 

3) Process more business logic and advanced analytics close to the data: When traditional databases were originally created in the 70s, dedicated specialized systems were created to meet application requirements due to computing constraints. A few such systems are web servers, predictive analytics systems, geospatial processing systems and rule engines. This IT landscape significantly increases data latency, data duplication, IT landscape complexity and the resources needed to manage and build applications on these systems. A modern in-memory database should be able to execute most of the business logic and advanced analytics inside the database, on the single copy of data used for both transactions and analytics, so that results are always real-time. This helps the business become more agile in leveraging opportunities and reacting to potential risks.

 

4) Provide tools to develop applications faster for any device and integrate well with open source software and other traditional databases: Today's developers are comfortable with open source application development tools such as Eclipse, and with languages such as SQL, JSON, JavaScript and JavaScript-based UI frameworks that deliver HTML5 UIs. Open source technologies such as Hadoop and Apache Spark provide excellent solutions to some business problems. An in-memory database should provide seamless integration with traditional databases and key open source innovations. Developers should not be deciding whether or not to keep a table in memory for every table used in an application. Developers should not be retesting their old applications after dropping OLTP indexes because they keep the table in memory. An in-memory database should also provide built-in data visualization tools.

 

5) Give choice in deployment options: Organizations expect to consume software through the cloud. A database that can be accessed from the public cloud will significantly reduce the initial capital expenditure and the time it takes to get started. Once an organization decides to go to production, a managed cloud service can provide the security, support and service levels organizations expect. Organizations should also be able to run the in-memory database on premise using commodity servers such as x86, and it should fit into a software-defined data center.

 

An in-memory database may require an initial investment and, like any database, requires data migration, but it should have a proven track record of simplifying the IT landscape and delivering recurring cost savings. This is in addition to the value organizations expect - becoming agile and building applications that can give a competitive advantage.

 

In summary, a traditional database that depends on disk but gives incremental performance with in-memory enhancements may look good in the short term, as it gives the ability to run applications as-is. But the fact is that it requires more admin and developer work to select the tables to keep in memory and to optimize applications on an ongoing basis. These databases use more system resources to keep less data in memory, and synchronizing multiple copies defeats the purpose of an in-memory database.

 

Those are my five.  Do you have yours?  What would you add to my list?

 

See how SAP HANA can help @ Innovation & Me

About SAP HANA

SAP HANA - One platform for all applications
SAP HANA – An In-Memory Data Platform for Real-Time Business 

Projected Cost Analysis of the SAP HANA Platform by Forrester

HOWTO ODBC .NET Framework connection with failover support


In his post "HOWTO ODBC .Net framework connection" (http://scn.sap.com/docs/DOC-33628), Keven Lachance showed us how to connect to HANA using the .NET Framework and ODBC.

 

Keven's post is very helpful and I would like to add to it by showing how you can add scale out failover support to the ODBC connection.

 

Within the SAP HANA Administration Guide under section Configuring Clients for Failover we are instructed to add a list of potential master servers, separated by a semicolon, to the connection string:

 

[Screenshot: failover connection string example from the SAP HANA Administration Guide]

I found that this does not work when using ODBC in the .NET Framework.  It seems that the first host in the list is parsed and only that host is used when making connections.

 

For example,  when attempting to add multiple hosts to Keven's example:

 

const string _strServerName = "hanaServer1:30015;hanaServer2:30015;hanaServer3:30015";

 

In this example, only hanaServer1:30015 will be considered for a connection.

 

I found that by adding a comma, not a semicolon, between the host entries, failover support was successfully added to the .NET ODBC connection:

 

const string _strServerName = "hanaServer1:30015,hanaServer2:30015,hanaServer3:30015";

 

In this example, if the connection to hanaServer1:30015 is not successful, then an attempt is made to hanaServer2:30015. If that connection is not successful, then an attempt is made to hanaServer3:30015. Only when a connection is not possible to any of the servers in the list is an exception thrown.

Shrink your Tables with SAP HANA SP08


Shrink your Tables with SAP HANA SP08

 

Abani Pattanayak, SAP HANA COE (Delivery)

Jako Blagoev, SAP HANA COE (AGS)

 

 

Introduction:

 

Yes, with HANA SP08 you can significantly reduce the size of your FACT tables. Depending on the size of the primary key and the cardinality of the dataset, you can get significant (up to 40%) savings in static memory usage.

 

These memory savings are relative to HANA SP07 or earlier revisions.

 

So what's the catch?


There is no catch.


The saving is based on how the primary key of the table is stored in HANA. Check the biggest FACT table in your SP07 or SP06 HANA database: the size of the primary key will be around 30-40% of the total size of the table.

 

With HANA SP08, we can eliminate this 30-40% of memory taken by the primary key - and there is no negative impact on query performance.

 

 

Show me the Money (What's the trick)?


You need to recreate the primary key of the table with INVERTED HASH option.

 

CREATE COLUMN TABLE "SAPSR3"."MY_FACT_TABLE" (
    "STORID"   NVARCHAR(10),
    "ORDERID"  NVARCHAR(15),
    "SEQ"      NVARCHAR(10),
    "CALMONTH" NVARCHAR(6),
    "CALDAY"   NVARCHAR(8),
    "COUNTRY"  NVARCHAR(3),
    "REGION"   NVARCHAR(3),
    ..
    ..
    PRIMARY KEY INVERTED HASH ("STORID", "ORDERID", "SEQ", "CALMONTH",
                               "CALDAY", "COUNTRY", "REGION")
)
WITH PARAMETERS ('PARTITION_SPEC' = 'HASH 8 STORID');

 

You can use the ALTER TABLE command to drop and recreate the primary key of the table.
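
A minimal sketch of that in-place variant, reusing the key columns from the example above (the constraint name is made up, and you should double-check the exact INVERTED HASH syntax for ALTER TABLE against the SQL reference of your revision):

ALTER TABLE "SAPSR3"."MY_FACT_TABLE" DROP PRIMARY KEY;
ALTER TABLE "SAPSR3"."MY_FACT_TABLE"
  ADD CONSTRAINT "MY_FACT_TABLE_PK" PRIMARY KEY INVERTED HASH
      ("STORID", "ORDERID", "SEQ", "CALMONTH", "CALDAY", "COUNTRY", "REGION");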

 

However, if you have a scale-out system or a really BIG fact table with billions of records, we'd highly recommend creating a NEW table with an INVERTED HASH primary key, copying the data over to the new table, and then renaming the tables.

 

Result

 

The following is the result of updating the primary key in a customer project. As you can see below, the saving in static memory is around 531 GB out of 1,980 GB.

So overall, there is a saving of at least 2 nodes (0.5 TB each) in a 9-node scale-out system.

 

The best part of this exercise: there is no negative impact on query performance.

 

Note: I'd suggest you review your existing system and evaluate if you can take advantage of this feature.

 

[Screenshot: static memory savings after the primary key change]

Keeping an eye on SAP HANA


During my SAP HANA classes I often get the question “How should I monitor the SAP HANA systems?” I think that is a good question and in this blog I want to explain how you can monitor your SAP HANA systems.

 

The monitoring options for SAP HANA systems

 

As an SAP System Administrator you need to keep your SAP HANA systems up and running. So what would be a good way of doing this? There are two options, and depending on your needs and available infrastructure you should decide which option suits you best.

 

  • You can monitor your SAP HANA systems using SAP HANA Studio
  • You can monitor your SAP HANA systems using SAP Solution Manager

 

 

Monitoring SAP HANA systems using SAP HANA Studio

 

This solution is best for customers that have no SAP ABAP footprint. The monitoring can be done using the monitoring tools included in SAP HANA Studio. Using SAP HANA Studio you can monitor the following areas:

 

  • Overall System Status using the System Monitor view
  • Detailed System Status using the Default Administration view
  • Memory usage per system or service using the Memory Overview view
  • Resource usage using the Resource Utilization view
  • System Alerts using the Alert view
  • Disk Storage using the Volumes view
  • Overall view using the SAP HANA Monitor Dashboard

 

Overall System Status using the System Monitor view

 

In the System Monitor you will find a status overview of all the SAP HANA systems connected to your SAP HANA Studio. As you can see in the screenshot below, it shows the system status including some important metrics. With this monitor you can quickly see if all systems are functioning within normal specifications.

 

[Screenshot: System Monitor view]

 

Detailed System Status using the Default Administration view

 

In the Detailed System Status view you get a detailed overview of the most important metrics of the selected SAP HANA system. With this information you can see if all the different areas (memory, CPU and disk) are running within the given thresholds. If not, you can follow the links for a deeper investigation.

 

[Screenshot: Default Administration view]

 

Memory usage per system or service using the Memory Overview view

 

I think that memory usage is one of the most important metrics in an SAP HANA system. That is why there are several nice views available to show the overall memory usage and the usage per service. These views also give the option to look at the memory usage in a specified time period. This can be very useful to investigate what happened last night.

 

Memory Overview per System

[Screenshot: memory overview of a system]

 

Memory Overview per Service

[Screenshot: memory overview per service]
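
If you prefer SQL over the studio views, roughly the same per-service numbers can be read from the monitoring views; a sketch (view and column names as I know them from the documentation, so verify on your revision):

SELECT HOST, SERVICE_NAME,
       ROUND(TOTAL_MEMORY_USED_SIZE / 1024 / 1024 / 1024, 2) AS "USED_GB"
  FROM M_SERVICE_MEMORY
 ORDER BY TOTAL_MEMORY_USED_SIZE DESC;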

 

Resource usage using the Resource Utilization view

 

The resource monitor lets me look at CPU, memory and storage metrics in a combined view. It also shows me the graph for a specified time period. Using this monitor gives me the opportunity to find the root cause of a problem.

 

[Screenshot: Resource Utilization view]

 

System Alerts using the Alert view

 

The System Alerts view shows all the alerts that have been triggered in the system. I can specify my own threshold values, and for the alerts that I think are important I can set up email notification. With this email notification I get informed about important alerts even when I'm looking after other systems.

 

[Screenshot: Alerts view]
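
For completeness, the current alerts can also be queried directly; a sketch assuming the embedded statistics service and its _SYS_STATISTICS views (the column names are my assumption, so check them in your system):

SELECT ALERT_TIMESTAMP, ALERT_ID, ALERT_RATING, ALERT_DETAILS
  FROM _SYS_STATISTICS.STATISTICS_CURRENT_ALERTS
 ORDER BY ALERT_RATING DESC;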

 

Disk Storage using the Volumes view

 

Even in an in-memory database, disk storage is important, so using the Volumes view you can keep track of the fill level and the I/O performance.

 

[Screenshot: Volumes view]

 

The SAP HANA Monitoring Dashboard

 

As of SAP HANA SPS 08 there is also an Administration Dashboard available that shows the most important metrics in an SAP Fiori dashboard.

 

[Screenshot: SAP HANA Monitoring Dashboard]

 

I have also recorded a video showing all the SAP HANA Studio monitoring features.

 

 

Monitoring SAP HANA systems using SAP Solution Manager

 

This solution is best for customers that already use SAP Solution Manager for managing and monitoring SAP systems. SAP HANA is added to SAP Solution Manager as a Managed System, and from there you can set up Technical System Monitoring. The Maintenance Optimizer (MOPZ) and DBACOCKPIT are also fully operational.

 

Using the transactions SM_WORKCENTER and DBACOCKPIT gives you at least as many monitoring capabilities (maybe even more) as are possible in SAP HANA Studio.

 

I have created a few videos to demonstrate the transaction SM_WORKCENTER, the Maintenance Optimizer (MOPZ) and DBACOCKPIT.

 

Monitoring SAP HANA using transactions SM_WORKCENTER


This video shows how you can monitor an SAP HANA system using SAP Solution Manager 7.1 SP12.

 

Monitoring SAP HANA using transaction DBACOCKPIT

 

This video shows the transaction DBACOCKPIT in a SAP Solution Manager 7.1 SP12 connected to a SAP HANA system.

 

Requesting an SAP HANA Support Package Stack using MOPZ

 

This video shows how you can use SAP Solution Manager 7.1 SP12 to request and download a Support Package Stack for SAP HANA.

 

If you want to learn more on SAP HANA and SAP Solution Manager then visit the SAP Education website. There is a curriculum for SAP HANA Administration and SAP Solution Manager.

 

You can also visit the SAP Learning Hub and have a look at my SAP Learning Room "SAP HANA Administration and Operations".

 

[Screenshot: SAP Learning Room "SAP HANA Administration and Operations"]

 

Have fun watching over SAP HANA

HANA and the SolMan syndrome


I am carefully following the latest HANA (and other unimportant stuff) surveys and SAP representatives' reactions to them. You may want to do some reading on what happened before reading my blog. If so, here are the useful sources:

  1. ASUG survey that started the whole thing: ASUG Member Survey Reveals Successes, Challenges of SAP HANA Adoption
  2. Blog by Jelena Perfiljeva on the ASUG survey topic and the stir around it: Y U No Love HANA?
  3. Blog by Steve Lucas on the ASUG survey: Thoughts on the ASUG Survey on SAP HANA
  4. Hasso Plattner himself on the ASUG survey results: The Benefits of the Business Suite on HANA
  5. DSAG survey: Demand for innovation potential to be demonstrated
  6. Dennis Howlett on the DSAG survey: Analyzing the DSAG survey on SAP Business Suite on HANA

 

If you made it here, let me share some thoughts with you. I must warn you that I don't have much real-life exposure to HANA and my thoughts are primarily based on what I consume from various sources like SCN or SAP official marketing. Combined with the customer surveys, my information sources are very well mixed and hilariously contradict one another.

 

The SolMan story

But to the point. Namely to the title of this article. When you read about HANA and customers being not so hot about it, what does that remind you of? It reminds me of Solution Manager. Note that I am a Security and Authorizations consultant working primarily with the basis component (which is the foundation of all ABAP based systems). I get to work with Solution Manager a lot. I don't claim to be a SolMan expert, but I have enough of them around me to be reasonably well informed about what they do and how the market is for them, as well as to hear feedback from the customers.

I don't want to discuss some recent confusions and disappointments of some of the customers about changes in the SolMan functionality. I believe the SolMan team on the SAP side is a team of seasoned engineers and they know what they're doing. What I want to concentrate on is the perception of SolMan as a symbol of SAP basis and infrastructure as a whole. And that is very much the same bucket where HANA ends up as well.

Every customer that runs an ABAP based system must run the basis component (BC is the good old name), which means database, user administration, roles and profiles, performance, custom development etc. Every customer must run this and have an internal team (often combined with an external one) to run the systems. SolMan is something that wise people see as a central hub for many (if not most) of these things, and if you deploy and use SolMan wisely, it offers huge benefits. You run jobs in your systems? I am pretty sure you do. Boom, here comes SolMan central monitoring. You do custom development? Whoosh, here come SolMan's CHARM, CTS+, CCLM etc. Seriously, for many basis things (for security things less so) SolMan offers a way to run everything centrally, which in my opinion provides some nice benefits.

But how come I see so many customers not investing in centralized basis operations via SolMan? How come, if the budget is cut, SolMan is axed in the first wave? How come so many people are trained on MM-Purchasing (random business example) but SolMan experience and understanding of the big picture is so rare?

In my opinion the problem is the following. Companies have a fixed budget they spend on IT. Part of that fixed budget is a fixed budget for SAP. The budget is fixed. Non-inflatable. No magic. Fixed. The SLA between the shared services centre, the competence centre or whatever you call the team or organization that provides the SAP services (and runs the systems) says that the functionality that keeps the business running must perform well, be secure and available, be patched, have people trained etc. For the rest of the company, this IT basement and its budget are a necessary evil, and one with limits. The budget is fixed and the outside perspective (and priority setting) is on what keeps the company running and making money. Tell me, haven't you ever joked about "being just a cost center"? We don't make money, we just keep the servers going.

So back to SolMan. To leverage the SolMan powers you need trained and knowledgeable people. Such people don't come cheap. And they get less cheap every year, because you have these shiny start-up hubs all around the world, cool companies worth billions (like Facebook and Google with free food and a laundry service), and they push the prices of the smart heads up as well as the number of available smart heads down. Anyway, you know what I mean. More costs for people, for their training, for making them happy. Then you need hardware to run SolMan on, you need to pay for the license, you need external support from time to time, more patching, more auditing etc.

And what is the value? What is the benefit? Once you bend the company's (IT team's) processes around SolMan (and win the motivation of the basis folks for SolMan, all of them!), then you can see some (substantial?) savings (in the hopefully not-so-distant future). And all that only if people commit to using the new features and the size of your organization makes it possible to reach the break-even point within this lifetime.

So let’s briefly summarize:

  1. Initial investment in the people, hardware and software
  2. Investment into the change management process that would readjust your people’s mind-sets and processes around the new tool.
  3. Savings are waiting for you in the future, some of them are rather theoretical and others will only arrive into your budget pool if everyone joins the effort.
  4. The good news is that SolMan is around for long enough that you have the knowledge spread around pretty well so hiring someone for your team to run the SolMan is not like hiring a Tesla engineer.
  5. In my opinion, the good news is that slowly consolidating some of your processes on SolMan now and others later gives you the possibility to pay a series of small prices and get a series of smaller returns over time.
  6. Last but not least SolMan does not do that many things that you can’t do without it. Can you name any such things? You can only do them on SolMan? Not with Excel or lots of clicking in the local system?

SolMan is being underestimated. Underused. Underappreciated.

 

The SolMan syndrome

Now back to HANA. Did you try to replace SolMan with HANA in some of the comments above? Just try it. How come so many people are trained on MM-Purchasing (random business example) but HANA experience and understanding of the big picture is so rare? To leverage the HANA powers you need trained and knowledgeable people. Once you bend the company's (IT team's) processes around HANA… Last but not least, HANA does not do that many things that you can't do without it, with Excel or lots of clicking in the system… I know you can see where I am coming from, right?

At this point I am taking a ten-minute break to re-read Hasso Plattner's blog…

We can immediately filter some of his points out as they are irrelevant for customers (or maybe it is better to say they are irrelevant for me, and I am a trained SAP engineer, I do this for a living and the future of my family depends at least partly on the success of SAP, I engage on SCN and I talk to SAP engineers… I think I have shown more dedication and loyalty than most of the customers).

Anyway I don’t want to argue here with Mr. Plattner as I respect him very much so I will paint my own picture here and you, dear reader, can choose what is closer to your everyday reality.

The two most important things about HANA are:

  1. The cost that the customer must pay for the new ride
  2. The benefit received for that cost

I don’t run a HANA system myself and only a few of my customers do (and they all run BW on HANA regardless of what other HANA options are). So I don’t have any idea about the costs (other than some mentions on the SCN). But I assume these costs are not low. They can’t be for cutting edge innovation (…see more kittens dying?).

We could go on about costs here, but you are a smart person, dear reader, you can get a rough picture about the costs yourself. It is also a bit unfair to complain about costs. In my opinion if the benefit outweighs the costs, it is worth it no matter what the cost is. So let’s concentrate on the value and especially the obstacles in reaping the value and benefits.

As I see it there are two benefits: speed and simplification.

Well let’s start with speed. Let’s assume I can pick random customers of mine and HANA would boost their business through the roof (since the costs would go through the roof as well because I need to pay for the HANA show I can only see the benefit going through the roof to break that even).

Let's try … a car manufacturer for example (random example, ok?). I have a production line that builds cars. This production line is built very efficiently. This production must never stop (ok, rarely stop in a controlled and planned manner). If I want to improve my earnings or savings, what do I do? I take a screw that I use 50 times in every car and make it 2% cheaper (replace screw with any other part with a value, if you're from the car manufacturing business; screw is just an example, ok?). How can the speed of my IT speed up my business?

Readjusting the production line based on some HANA invention seems to be out of the question – time consuming, expensive etc. (correct me if I am wrong, I welcome a discussion).

Would I change my supply chain based on the HANA fast data? How? I have dozens of main suppliers I depend on, they each have dozens of their suppliers they depend on. I have my supply chain diversified to reduce the risk of my production line going down because I am out of screws. I don’t see HANA helping me with my supply chain. I have long term contracts with my suppliers (which are not easy to change) and I have Just-in-time (JIT) delivery to be super-efficient. Still no signs of HANA here.

Can I improve my distribution channels based on HANA? Maybe I can ship some cars to country XYZ because I can see a tendency of the demand to go up a bit there. Normal mortals that order a new car either pay (or are given a voucher) for speedy delivery (anything under 3 months or so) or they just wait for those three months. Does sending a couple more cars (that can't be customized and must be sold as I built them) improve my numbers?

I am not selling the cars I am producing. How can HANA sell my cars? Maybe I am late to the market with the model. Or it is too expensive compared to my competitors. I can either see it (base it on numbers) or not. But if I get the results of such analysis a day faster (assuming HANA cut the time of a long running job from a day to 8 minutes), how does it matter? What is a day in a life cycle of a car model?

 

On speed and people

Speed. That sounds cool right? Car manufacturers sell fast cars for a premium. People like fast cars. Do you like fast cars, dear reader? I would certainly try a couple of them on a German autobahn.

But do people like speed? I don’t think so. Speed means deadlines. It means thinking fast, acting fast. Sometimes it means making mistakes. It means facing risks. It means stress. It means swimming into the unknown with the pace that leaves less than our usual time for re-adjustment. That makes us uncomfortable. Discomfort. I don’t like that. Here is my comfort zone. I don’t want to go …there. I want to stay here. Inertia. Action and reaction.

Sorry for the emotional detour. What I am trying to say is that processes are run by people. They don't run in machines. No matter how fast one report is (whether it runs on HANA or not), there are people that work with the machine, that provide inputs, collect outputs etc. There is a threshold where system performance becomes a pain. See a website that takes 30 seconds to load. That is annoying, right? But if that report that you only run once a week for your team meeting takes 12 seconds or 14, does it matter? Or let's say that report takes 2 minutes to run. If you could push that down to 2 seconds, would you run the report more often? If you ran the report more often, would there be a benefit for you, your boss or your company in you doing it?

You can't change people. At least not easily. For many people – the normal mortals and coincidentally users of an SAP system – the IT thing and the whole SAP system is a black box. That means that when your secretary types in your travel expenses, she will not do it faster because this system runs on HANA. She does not know about HANA. She does not care either. Let's say you work in the company's IT and your boss decides your budget (reality, right?). Your boss is not an IT engineer (no matter if it is a lady or a gentleman), or even if they are, they are not a HANA fanatic. Probably not even an SAP fanatic. How do you sell such a person your most recent HANA ambition?

If you are in the business for long enough, you must have heard the expression "bend the company around SAP". Let's put aside the fact that SAP brings some great industry best practices and such bending can bring a lot of good into a company. People don't like this. They will change their ways if the stimulus is strong enough (less work?) or the pressure is big enough (you must do it or you go). See iPhone. I don't like Apple in general, but I can see how the iPhone had this strong stimulus when it was introduced (it was idiot-proof to use, colourful, the entry barrier very low since it is idiot-proof etc.) and that is why it became a huge success. Is this the case with HANA? No. Huge adoption barriers and unclear benefit (for a normal mortal, an iPhone type of user). People will stand their ground. You want to bend your company around the new opportunities and reach for new horizons? Well, you must fire the people or wait for them to die (meaning their career at your company…).

IT is a black box for them. It must just work. They don’t care if you run SolMan. They don’t care if you run HANA (unless there is a problem with a vital threshold – like the one with the web page response – how many companies have such hard thresholds? Retailers maybe. Who else?). Technology shift that brings light-fast speed is not the killer trick.

 

...then it must be the simplification!

Then it must be simplification. Hm, ok. What could that mean? Mr. Plattner drops some hints. Simplified ERP? All my systems (CRM, SRM, ERP) put together? No BW system (because it is not needed)?

That sounds very very cool. If you’re the marketing guy and you buy what you sell. Reality check?

It does not sound like you take what you have (current ERP, current CRM, all the systems that you are currently running), push a button and voilà… an sERP system. I still remember the OSS support ping-pong when the landscape optimizer product (or whatever the official name was) was introduced. So I don't see how I could easily put my current systems together into one.

I am a developer. I can see the loads of code that live in my system. Tons and tons of code where the quality varies and the "Date created" varies from 199X to yesterday. I have customers with systems full of non-Unicode programs. How could one turn this around into an sERP easily? As a developer myself I know that it is probably easier (from the development process organization point of view, quality standpoint etc.) and also right (because of the new software design trends etc.) to start over. Oh. But that means several things.

That means that SAP will probably start over with what they have. Either partially or completely. That means new bugs. New support ping pongs. New products aspiring for maturity which will take years and loads of frustration.

That also means I will have to implement or re-implement what I have. More consulting. More money needed. And spent. More dependency on externals that learn fast enough to keep themselves up-to-date with what SAP produces.

What about my custom code? If I have this simplified ERP thing now, it has a different data model. Different APIs. Different programs. I may not need my custom programs anymore. Or I may end up with a need for more. Gosh. More assessments. More upgrades. More development. More audits. More.

That was my company. But things also get personal. It is my job that we are talking about here, and what happens to my job when there is no CRM, no SRM etc.? Unless I am overlooking something, BW comes first here. If BW is not needed anymore because everything is real-time, and I am a BW specialist, what will I do for a living? I am not needed anymore. I am obsolete. A dinosaur. A fossil. How many customers are out there running BW? What happens with those people if BW is not needed anymore? Will that happen fast? Or over a ten-year period, so they can adjust themselves and still keep their families fed and happy and safe?

When I hear about simplification in other areas these days, people translate it into job cutting. Simplified, lean, that means people will get fired. Not every company is so smart to understand that by automating or simplifying things you can give more advanced, more innovative type of work to people that don’t have to perform repetitious tasks anymore. That would boost their motivation. They would push the horizons further. They would have fun doing it (not necessarily everyone, but ok). Some companies lay people off instead.

I know, dear SAP, that you mean well. But you need to explain that better. You need to give people evidence, roadmaps (with meat on the bones), set expectations right, explain how we get from point A to B so that everyone is still on board. Remember SEALs? We leave no man (customer) behind. Tell us how you plan to do it. Dispel fear, confusion.

I know simplification is good. I like the Einstein quote (if it was Einstein, I hope so): "If you can't explain it simply, you don't understand it well enough". I don't think that is the case with SAP. SAP invented the ERP as we know it (my opinion). The data model and the processes and the customizing, the know-how collected and invented hand in hand with millions of customers, all that is super impressive. I am sure SAP will know how to simplify because they know the business well enough (I am just a bit afraid that the individuals that will be responsible for the simplification process will not deliver on a consistent quality level, but that is a different story).

Back to the SolMan beginning. I didn't mean to criticize HANA. I didn't mean to criticize SolMan either. Both products are great. But the way they're sold, the way they are perceived, is in my opinion very similar. You don't need them. They improve something you already have. Challenging.

But all will be well one day. For HANA. For customers. For SAP. But it is not that easy how SAP marketing sees it. You still have room for improvement there and please consider if it is not a good idea to fill that room with guidance, with numbers, with evidence, with fighting with your customers hip-to-hip. It is not you and them. It is us.

 

p. s.: Are you a normal mortal and want to read more on HANA? Consider Owen Pettiford’s Impact of Platform Changes: How SAP HANA Impacts an SAP Landscape. I quite liked it although it is just an overview.


SAP HANA Distinguished Engineer Badge

$
0
0

You may have noticed that a number of people on SCN have received the SAP HANA Distinguished Engineer Badge! A shiny red star with a RAM chip in the middle!

HANADistinguishedEngineer75png1464f6c666d.png

And... you may be wondering how to get it. The SAP HANA Distinguished Engineer Program looks to recognize those individuals who are positively contributing to the SAP HANA Ecosystem. The badge is achieved via a HANA Distinguished Engineer Nominations process, and the HANA Distinguished Engineer Council meets periodically to review nominations. In addition, we sometimes scour SCN and other places to proactively nominate worthy individuals.

 

What does it take to be a HDE?

 

It's pretty simple. We have a SAP HANA Distinguished Engineer FAQ which describes this in more detail, but there are basically two things.

 

1) Be a HANA practitioner. You need to be working with customers on projects and have real-world experience.

 

2) Share your knowledge. You need to consistently share quality technical content with the public.

 

Everything else is open to interpretation - some HANA product managers work with customers, which is awesome. Some people only share content behind company firewalls, which we don't recognize as public content. Some create more "high level" and "marketing" content, which we recognize as valuable to the community, but we don't recognize that content for this program.

 

What is the purpose of the HDE program?

 

The HDE program looks to further adoption of the SAP HANA platform by encouraging a thriving community of practitioners and recognizing those who would be an asset to any customer project.

 

Why is the community aspect so important?

 

It's part of the core beliefs of the people who set up the program that the best way to help a technology is to create a thriving community of content writers and sharers. It's the same reason why we are huge supporters of the OpenSAP folks.

 

Also note that the HDE program is created by the community, for the benefit of customers. It's sponsored by SAP, and we are very thankful to have Saiprashanth Reddy Venumbaka and Craig Cmehil help lead it, but SAP don't own it.

 

Who can't be a HDE?

 

We get a lot of submissions from people who are really valuable to the ecosystem - trainers, sales, pre-sales, marketing. All that content is really important, but every HDE is someone whom customers would want on their project team, so whilst we feel really bad when those individuals are nominated, they can't be HDEs.


We also get a lot of submissions from awesome consultants who don't share technical content publicly. If you don't share content publicly, you can't be a HDE.

 

We added a "**** Please note that if there isn't public material linked here, the candidate will not be considered ****" to the SAP HANA Distinguished Engineer Nomination Form but that didn't stop some people from nominating themselves without it!

 

Wow, that's an intimidating list of people!

 

Look, you couldn't have a program like the HDE program without people like Lars Breddemann and Thomas Jung! But, you don't have to be a rock star to be a HDE, just a regular person delivering projects and sharing quality content. That said, we definitely screen the actual content that people produce; if it's in any way negative to the community (or technically inaccurate, or just copies of documentation), we'll pass.


There's actually one individual whom the HDE council has invited twice and who has declined twice (you know who you are!), because they don't think they have sufficient real-world experience.

 

What about diversity?

 

This year, the popularity of SAP HANA has thankfully meant that the HDE program has grown past American, German and British consultants. We have HDEs from Poland, Czech Republic, Argentina, Netherlands, Sweden, Ireland, Canada, India, Brazil and China, which is really cool. But we are ashamed to say that there are no women. Let us know if you can help with that.

 

Does being a HDE help with career progression? What's in it for me?

 

That's a very tricky question because it is very difficult to benchmark. HANA is a very hot technology and experienced resources are always in demand, and the HDE brand is definitely intended to be a mark of good quality resources, but it's up to individual employers to recognize this. Other programs like Microsoft's MVP program are considered to be positive to careers, so it does stand to reason.

 

As for what's in it for you, sharing concepts makes a consultant more rounded and a better communicator. The resume has been replaced by LinkedIn and many employers look for individuals with a brand and referenceability. HDEs get opportunities to speak at events, webinars, to write books and other activities. If you don't see that as good for your career then that's cool, the program just isn't for you.

 

So how do you get that badge?

 

There are four simple steps!

 

1) Sign up to SCN. The home of the HDE program is SCN, so you do need a SCN ID to get the badge!

 

2) Get yourself on a HANA project. You're going to need that real world experience!

 

3) Share what you learnt. Everyone shares in their own way and we don't prescribe a particular way. It can be speaking, writing blogs, forum activity, webcasts, podcasts. Whatever you like. You can be active on SCN, Slideshare, Twitter, Stack Overflow or anywhere else you choose, but remember the content has to be public. That training session you delivered to your peers in Walldorf doesn't count!

 

4) Nominate yourself, or wait for someone else to nominate you. HDEs are chosen on merit, so it's just fine to nominate yourself, we don't mind.

Further explained on Linux Free Memory, Virtual Memory, Resident Memory vs HANA used Memory and Pool Memory

$
0
0

 

 

a) OS Physical Memory – the total amount of memory available on the physical host. However, the memory allocation limit for SAP HANA usage is approximately 93-95% of this (if no global allocation limit is specified).


b) Virtual Memory – memory reserved from the Linux OS for the SAP HANA processes; this entire reserved memory footprint of a program is referred to as Virtual Memory. SAP HANA virtual memory is the maximum amount that the process has been allocated, including its reservation for code, stack, data, and the memory pool under program control. SAP HANA virtual memory grows dynamically when more memory is needed (e.g. table growth, temporary computations). When the current pool memory cannot satisfy a request, the memory manager requests more memory from the OS for the pool, up to the predefined memory allocation limit.


c) Resident Memory – the physical memory actually in operational use by a process.


d) Pool Memory – when SAP HANA starts, a significant amount of memory is requested from the OS for this memory pool, which stores all the in-memory data and system tables, thread stacks, temporary computations and other structures needed to manage the HANA database.


d1) Only part of the pool memory is used initially. When more memory is required for table growth or temporary computations, the SAP HANA memory manager obtains it from this pool. When the pool cannot satisfy the request, the memory manager increases the pool size by requesting more memory from the OS, up to the pre-defined allocation limit. Once a computation completes or a table is dropped, the freed memory is returned to the memory manager, which recycles it back into the pool.


e) SAP HANA Used Memory – the total amount of memory currently used by the SAP HANA processes, including the currently used part of the pool memory. The value drops when memory is freed after each temporary computation and increases when more memory is needed and requested from the pool.
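To see how these values relate on a running system, you can query the memory monitoring views. A minimal sketch, assuming the standard M_SERVICE_MEMORY columns of your revision (check the view definition before relying on it); the OS-level figures (physical and resident memory) can similarly be read from M_HOST_RESOURCE_UTILIZATION:

-- Sketch: compare used memory against the effective allocation limit per service
SELECT HOST, SERVICE_NAME,
       ROUND(TOTAL_MEMORY_USED_SIZE/1024/1024/1024, 2)     AS USED_GB,
       ROUND(EFFECTIVE_ALLOCATION_LIMIT/1024/1024/1024, 2) AS ALLOC_LIMIT_GB
  FROM "PUBLIC"."M_SERVICE_MEMORY";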


I hope the above provides a clearer picture of SAP HANA memory on Linux. If time permits, I'll post more next on how to analyze the current memory usage of SAP HANA (code, stack, shared and heap memory) based on the analysis I have done.



Smart Data Access - Basic Setup and Known Issues

$
0
0

Purpose


This blog will focus on the basic setup of Smart Data Access (SDA) and then outline some problems that customers have encountered. Some of the issues in the troubleshooting section come directly from customer incidents.

 

There is already a lot of information on Smart Data Access which this blog does not aim to replace.  Throughout the blog, I will reference links to other documentation that can cover the topics in more detail.

 


What is Smart Data Access (SDA)?


SDA allows customers to access data virtually from remote sources such as Hadoop, Oracle, Teradata, SQL Server, and more. Once a remote connection to a data source is set up, we can connect to its tables virtually and query against them, or use them in data models, as if the data resided in a SAP HANA database.

 

This means customers do not have to migrate or copy their data from other databases into a SAP HANA database.

 

How to Setup SDA?


Smart Data Access was introduced in SAP HANA SPS 6, so if you intend to use SDA, be on at least this revision.

 

Prior to connecting to a remote database you will need to configure an ODBC connection from the server to the remote database. For assistance on how to install the database drivers for SAP HANA Smart Data Access, please refer to SAP note 1868702 and the SAP HANA Academy videos


https://www.youtube.com/watch?v=BomjFbJ25vo&index=16&list=PLkzo92owKnVx_X9Qp-jonm3FCmo41Fkzm


Documentation on SDA can be found in the SAP HANA Admin guide chapter 6


http://help.sap.com/hana/SAP_HANA_Administration_Guide_en.pdf

 

 

Changes to SDA?


For information on what changes to SDA have occurred with each revision of SAP HANA, please review the following:

 

SP6 -> SP 7 delta

 

http://www.saphana.com/servlet/JiveServlet/previewBody/4296-102-7-9005/HANA_SPS07_NEW_SDA.pdf

 

SP7 -> SP 8 delta

 

http://www.saphana.com/docs/DOC-4681

 

What Remote Sources Are Supported (As of SP8)?

 

Hadoop

Teradata

SAP HANA

Oracle 12c

Sybase ASE

Sybase IQ

DB2 (SP8)

Microsoft SQL Server 2012

Apache Spark (SP8)

 

** Please note that you could connect to other databases via generic ODBC, but we cannot guarantee that it will work. **

 

 

How To Add a Remote Source

 

 

 

Once you have configured your ODBC files for the external data source of your choosing, you can set up a connection to that source in Studio by doing the following (we are using Oracle in our example):

 

  1. Expand your system -> Provisioning
    addpng.png
  2. Right click on the Remote Sources folder and select New Remote Source…
    addremote.png
  3. The main window pane will request you to enter the connection information
    addadapter.gif
  4. Click on the Adapter Name drop-down and select the appropriate adapter (for this example we will select Oracle). The supported adapters and versions are:
      1. ASE – Adaptive Service Enterprise: version 15.7 ESD#4
      2. Teradata – version 13 and 14
      3. IQ – Version 15.4 ESD#3 and 16.0
      4. HANA – HANA revision 60 and up
      5. HADOOP – HDP 1.3 support added SP7
      6. Generic ODBC – This to connect to other databases that support ODBC protocol, however we do not guarantee that it will
        work
      7. Oracle – Oracle 12c support added in SP7
      8. MSSQL – Microsoft SQL Server ver11 support added in SP7
      9. Netezza – Netezza version 7 added in SP8
      10. DB2– DB2 UDB version 10.1 added in SP8
  5. Fill in the connection properties and credentials
    ORACLE.gif
  6. Press the execute button to save this connection
    executed.gif
    ** As an alternative, you can create a remote source through SQL using the following command: CREATE REMOTE SOURCE <Name>
    ADAPTER "odbc" CONFIGURATION FILE 'property_orcl.ini' CONFIGURATION 'DSN=<DSNNAME>' WITH CREDENTIAL TYPE 'PASSWORD' USING 'user=<USERNAME>;password=<Password>';
  7. Press the Test Connection button to verify the connection to the source is successful
    test.gif
  8. Under the Remote Source you will now see your connection
    created.gif

 

 

How To Access The Virtual Tables

 

  1. After adding your New Remote Source, expand it  to see the users and the tables

    virtualtable.gif
  2. Right click on the table you would like to access and select ‘Add as Virtual Table’
    addvirt.gif
  3. You will then choose the alias name and the schema to which you would like to add this virtual table
    virt2.gif
  4. After hitting create you will get a confirmation message


    success.gif
  5. Now you can check the schema you have chosen
    virtdone.gif
  6. If you select ‘Open Definition’
    def.gif
  7. You will see under the ‘Type’ it says ‘Virtual’

typevirt.gif
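As with the remote source itself, virtual tables can also be created and queried via SQL instead of the Studio UI. A minimal sketch, assuming an Oracle remote source named ORACLE_SOURCE and a remote table SCOTT.ORDERS (all names are examples):

-- Create a virtual table over a remote table (sources without a database level use <NULL>)
CREATE VIRTUAL TABLE "MYSCHEMA"."V_ORDERS"
  AT "ORACLE_SOURCE"."<NULL>"."SCOTT"."ORDERS";

-- Query it like a local table; eligible operations are delegated to the remote database
SELECT COUNT(*) FROM "MYSCHEMA"."V_ORDERS";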

 

 

 

 

Reported Problems

Description: After Restarting the HANA database, object privileges are lost

Resolution: This is resolved in Revision 75. Documented in note 2077504 - Smart Data Access object privileges dropped from role after restarting HANA

 

Description: [08S01][unixODBC][SAP AG][LIBODBCHDBSO][HDBODBC] Communication link failure;-10709 Connect failed (no reachable host left)

Resolution: This issue was caused by firewall blocking connectivity

 

Description: Issue connecting to Teradata
/opt/teradata/client/13.10/odbc_64/bin> ./tdxodbc: /usr/sap/odbc_drivers/unixODBC-2.3.2/DriverManager/.libs/libodbc.so: no version information available (required by ./tdxodbc)
Resolution: tdxodbc is loading the incorrect libodbc.so library. The $LD_LIBRARY_PATH environment variable should have the Teradata libraries appear before the UNIX ODBC Driver Manager libraries.

 

Description: Numeric overflow when executing a query on a virtual table.
The error will look like:

com.sap.dataexplorer.core.dataprovider.DataProviderException:
Error: [314]: numeric overflow: 5.00 at

com.sap.ndb.studio.bi.datapreview.handler.DataPreviewQueryExecutor.executeQuery(DataPreviewQueryExecutor.java:192)at

com.sap.dataexplorer.ui.profiling.ProfilingComposite$12.run

Resolution: This issue is resolved in Revision 74.

 

Description: The GROUP BY & aggregation is not pushed down to the remote SDA source when using a graphical calculation view.
Remote Source: This incident was reported with IQ
Resolution: HANA will not be able to push down aggregate functions to IQ via SDA until HANA SPS9


Description: After configuring an Oracle remote source, the "Adapter Name" will switch to MSSQL (Generic ODBC)
Resolution: This is simply a display issue. It will still use the correct Oracle adapters. The issue has been reported to development, but is not currently scheduled to be fixed.

 

 

Description: After changing parameters of the SDA connection, the virtual tables disappear:
Remote Source: IQ 16.03
Solution: Edit the remote source parameters using "ALTER REMOTE SOURCE <remote_source_name>"
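A sketch of what such a statement could look like, mirroring the CREATE REMOTE SOURCE syntax shown earlier (source name, property file, DSN and credentials are placeholders):

ALTER REMOTE SOURCE "MY_REMOTE_SOURCE" ADAPTER "odbc"
  CONFIGURATION FILE 'property_iq.ini'
  CONFIGURATION 'DSN=<NEW_DSN>'
  WITH CREDENTIAL TYPE 'PASSWORD' USING 'user=<USERNAME>;password=<PASSWORD>';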


Description: SQL Server nvarchar(max) fields don't work with Smart Data Access.  Error:

SAP DBTech JDBC: [338]: zero-length columns are not allowed: LONGTEXT: line 1 col 209 (at pos 208)

Resolution: Planned to be fixed in SP09

 

** MORE issues will be updated at the bottom **

 

Authors

 

Man-Ted Chan

Jimmy Yang


New Statistics Server Implementation

$
0
0

*This is a repost due to request*


Hi All,


My name is Man-Ted Chan and I’m from the SAP HANA product support  team. Today’s blog will be about the new SAP HANA Statistics Server. We will review some background information on it, how to implement it, and what to look for to verify it was successful.

 

What is the Statistics Server?

 

The statistics server assists customers by monitoring their SAP HANA system, collecting historical performance data and warning them of system alerts (such as resource exhaustion). The historical data is stored in the _SYS_STATISTICS schema; for more information on these tables, please view the statistical views reference page on help.sap.com/hana_appliance

 

What is the NEW Statistics Server?

 

The new Statistics Server is also known as the embedded Statistics Server or Statistics Service. Prior to SP7 the Statistics Server was a separate server process - like an extra Index Server with monitoring services on top of it. The new Statistics Server is now embedded in the Index Server. The advantage of this is to simplify the SAP HANA architecture and to help avoid the out-of-memory issues of the old Statistics Server, which by default was limited to using only 5% of the total memory.

 

In SP7 and SP8 the old Statistics Server is still implemented and shipped to customers, but they can migrate to the new statistics service if they would like by following SAP note 1917938.

 

How to Implement the New Statistics Server?

 

The following screen caps will show how to implement the new Statistics Server. I also make note of what your system looks like before and after you perform this implementation (the steps to perform the migration are listed in SAP note 1917938 as well).

 

In the SAP HANA Studio, view the landscape and performance tab of your system and you should see the following:

1.png2.png

Prior to migrating to the new statistics server, please take a backup of your system. Once that is done, do the following:

Go to the Configuration tab and expand nameserver.ini-> statisticsserver->active

3.png

Double click on the value ‘false’ and enter the new value ‘true’ into the following popup:

4.png

After pressing ‘Save’ the Configuration tab will now show the following:

5.png
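As an alternative to editing the parameter in the Studio Configuration tab, the same change can presumably be made with an ALTER SYSTEM statement (a sketch; follow SAP note 1917938 as the authoritative procedure):

-- Enable the embedded statistics service via SQL instead of the Studio configuration editor
ALTER SYSTEM ALTER CONFIGURATION ('nameserver.ini', 'SYSTEM')
  SET ('statisticsserver', 'active') = 'true'
  WITH RECONFIGURE;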

Once this is done, check the ‘Landscape’ and ‘Performance’ tabs.

6.png

7.png

As we can see, the Statistics Server is now gone. Do not restart your system during this migration. To check the status of the migration, run the following:

SELECT * FROM _SYS_STATISTICS.STATISTICS_PROPERTIES where key = 'internal.installation.state'

The results will show the status and the time of the deployment.

Key: internal.installation.state
Value: Done (okay) since 2014-09-20 02:55:34.0360000

 

Do not restart your SAP HANA system until the migration is completed


Trace Files

If you run into issues implementing the new statistics server, we will need to look into the SAP HANA trace files. Logs that we can check during the implementation of the new Statistics Server are the following:

  • Statistics Server trace
  • Name Server trace
  • Index Server trace

If the deployment does not work, review the trace files to pinpoint where the error occurred. Below are examples of trace snippets from a successful deployment of the embedded Statistics Service.

Statistics Server Trace

In the Statistics Server trace we will see the statistics server shutting down:

[27504]{-1}[-1/-1] 2014-09-20 02:55:37.669772 i Logger BackupHandlerImpl.cpp(00321) : Shutting down log backup, 0 log backup(s) pending
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.340345 i Service_Shutdown TrexService.cpp(05797) : Disabling signal handler
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.340364 i Service_Shutdown TrexService.cpp(05809) : Stopping self watchdog
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.340460 i Service_Shutdown TrexService.cpp(05821) : Stopping request dispatcher
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.340466 i Service_Shutdown TrexService.cpp(05828) : Stopping responder
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.341478 i Service_Shutdown TrexService.cpp(05835) : Stopping channel waiter
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.341500 i Service_Shutdown TrexService.cpp(05840) : Shutting service down
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.350884 i Service_Shutdown TrexService.cpp(05845) : Stopping threads
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.354348 i Service_Shutdown TrexService.cpp(05850) : Stopping communication
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.356233 i Service_Shutdown TrexService.cpp(05857) : Deleting console
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.356240 i Service_Shutdown TrexService.cpp(05865) : Deleting self watchdog
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.356260 i Service_Shutdown TrexService.cpp(05873) : Deleting request dispatcher
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.356278 i Service_Shutdown TrexService.cpp(05881) : Deleting responder
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.356302 i Service_Shutdown TrexService.cpp(05889) : Deleting service
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.356444 i Service_Shutdown TrexService.cpp(05896) : Deleting threads
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.356449 i Service_Shutdown TrexService.cpp(05902) : Deleting pools
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.356454 i Service_Shutdown TrexService.cpp(05912) : Deleting configuration
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.356458 i Service_Shutdown TrexService.cpp(05919) : Removing pidfile
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.356515 i Service_Shutdown TrexService.cpp(05954) : System down


Name Server Trace

 

In the Name Server trace you will see it being notified that the Statistics Server is shutting down and the topology is getting updated. An error that you might encounter in the Name Server trace is the following:

STATS_CTRL NameServerControllerThread.cpp(00251) : error installing

Please review SAP note 2006652 to assist you in resolving this. Below is a successful topology update:

  1. NameServerControllerThread.cpp(00486) : found old StatisticsServer: mo-517c85da0:30005, volume: 2, will remove it

[27065]{-1}[-1/-1] 2014-09-20 02:55:34.050358 i STATS_CTRL NameServerControllerThread.cpp(00489) : forcing log backup...

[27065]{-1}[-1/-1] 2014-09-20 02:55:34.051287 i STATS_CTRL NameServerControllerThread.cpp(00494) : log backup done. Reply: [OK]

--

[OK]

--

[27065]{-1}[-1/-1] 2014-09-20 02:55:34.051292 i STATS_CTRL NameServerControllerThread.cpp(00497) : stopping hdbstatisticsserver...

[27065]{-1}[-1/-1] 2014-09-20 02:55:34.054522 i STATS_CTRL NameServerControllerThread.cpp(00522) : waiting 5 seconds for stop...

[27426]{-1}[-1/-1] 2014-09-20 02:55:34.323824 i Service_Shutdown TREXNameServer.cpp(03854) : setStopping(statisticsserver@mo-517c85da0:30005)

[27065]{-1}[-1/-1] 2014-09-20 02:55:39.054777 i STATS_CTRL       NameServerControllerThread.cpp(00527) : hdbstatisticsserver stopped

[27065]{-1}[-1/-1] 2014-09-20 02:55:39.054796 i STATS_CTRL NameServerControllerThread.cpp(00530) : remove service from topology...

[27065]{-1}[-1/-1] 2014-09-20 02:55:39.056706 i STATS_CTRL       NameServerControllerThread.cpp(00534) : service removed from topology

[27065]{-1}[-1/-1] 2014-09-20 02:55:39.056711 i STATS_CTRL NameServerControllerThread.cpp(00536) : remove volume 2 from topology...

[27065]{-1}[-1/-1] 2014-09-20 02:55:39.058031 i STATS_CTRL NameServerControllerThread.cpp(00540) : volume removed from topology

[27065]{-1}[-1/-1] 2014-09-20 02:55:39.058038 i STATS_CTRL NameServerControllerThread.cpp(00542) : mark volume 2 as forbidden...

[27065]{-1}[-1/-1] 2014-09-20 02:55:39.059263 i STATS_CTRL NameServerControllerThread.cpp(00544) : volume marked as forbidden

[27065]{-1}[-1/-1] 2014-09-20 02:55:39.059269 i STATS_CTRL NameServerControllerThread.cpp(00546) : old StatisticsServer successfully removed

[27065]{-1}[-1/-1] 2014-09-20 02:55:39.060823 i STATS_CTRL NameServerControllerThread.cpp(00468) : removing old section from statisticsserver.ini: statisticsserver_general

[27065]{-1}[-1/-1] 2014-09-20 02:55:39.072798 i STATS_CTRL NameServerControllerThread.cpp(00473) : making sure old StatisticsServer is inactive statisticsserver.ini: statisticsserver_general, active=false

[27065]{-1}[-1/-1] 2014-09-20 02:55:39.083604 i STATS_CTRL NameServerControllerThread.cpp(00251) : installation done

[27065]{-1}[-1/-1] 2014-09-20 02:55:39.083620 i STATS_CTRL NameServerControllerThread.cpp(00298) : starting controller

 

Index Server Trace

 

The statistics service is a set of tables and SQL procedures, so if you check the index server trace you will see the deployment of the versioned SQL procedures, and any error that occurred during their SQL execution.

Here is an example of a successful deployment:

upsert _SYS_STATISTICS.statistics_schedule (id, status, intervallength, minimalintervallength, retention_days_current, retention_days_default) values (6000, 'Idle', 300, 0, 0, 0) with primary key;

END;

[27340]{-1}[-1/-1] 2014-09-20 02:55:29.802118 i TraceContext     TraceContext.cpp(00718) : UserName=

[27340]{-1}[-1/-1] 2014-09-20 02:55:29.802110 i STATS_WORKER     ConfigurableInstaller.cpp(00168) : creating procedure for 6000: CREATE PROCEDURE _SYS_STATISTICS.Special_Function_Email_Management (IN snapshot_id timestamp, OUT was_cancelled integer) LANGUAGE SQLSCRIPT SQL SECURITY INVOKER AS

-- snapshot_id [IN]: snapshot id

-- was_cancelled [OUT]: indicator whether the specialfunction has been cancelled

l_server string;

 

How do I check if it is running?

 

If you suspect that your new Statistics Service is not running, you can check under the Performance -> Threads tab

8.png

 

Or you can run the following query:

select * from"PUBLIC"."M_SERVICE_THREADS"where thread_type like'%ControllerThread (StatisticsServer)%'

 

How Do I Revert Back?

 

If for some reason you need to go back to the original Statistics Server, you will not be able to just change the value of nameserver.ini -> statisticsserver -> active back to false; you will have to perform a recovery to a point in time before you performed the migration.

Build your own Wikipedia Keynote Part 1 - Build and load data

$
0
0

So by now you may have seen the Wikipedia page counts model that we built for the keynote. I'll blog later on about the logistical challenges of getting 30TB of flat files and building a system to load it in 10 days, but since SAP TechEd & d-code is a conference with a load of developers, I'm going to spend some time showing you how to build your own.

 

The beauty of the SAP HANA platform is that you can build this whole model inside the HANA platform using one tool - HANA Studio

 

Background

 

The background is that Wikipedia publishes a list of Page view statistics for Wikimedia projects, which are consolidated page views per hour by title and by project. These go back to 2007 and are now a total of nearly 250bn rows. It's a fascinating dataset because the cardinality is really rough on databases: there are over 4m articles in the English version alone, and well over that including all projects.

 

The total dataset is around 30TB of flat files, which translates into around 6TB of HANA database memory required. The SAP HANA Developer Edition is a 60GB Amazon EC2 cloud system, so you can comfortably fit around 30GB of data in RAM, which is around 0.5% of the overall dataset. That means we can comfortably fit around 3 weeks of data. This should be enough for you to have some fun!

 

So how do you get started? Trust me, its very easy to do!

 

Step 1 - get SAP HANA Developer Edition

 

N.B. This is free of charge from SAP, but you will have to pay Amazon EC2 fees. Be mindful of this and turn off the system when you're not using it.

 

It takes a few minutes to setup, because you have to configure your AWS account to receive the HANA Developer Edition AMI, but Craig Cmehil and Thomas Grassl have done a great job of making this easy, so please go ahead and configure the HANA Developer Edition!

 

You can of course use an on-premise version or any other HANA instance, though our scripts do assume that you have internet access; if your system doesn't, then you will have to adapt them. That's part of the fun, right?

 

Step 2 - Create and build the model

 

For the purposes of this exercise, this couldn't be easier as there's just one database table. Note that we use a HASH partition on TITLE. In the big model, we actually use a multilevel partition with a range on date as well, but you won't need this for just 3 weeks. The HASH partition is really handy as we are mostly searching for a specific title, so we can be sure that we'll only hit 1/16th of the data for a scan. This won't hurt performance.

 

Also note that there's a 2bn row limit to partition sizes in HANA, and we don't want to get near to that (I recommend 200-300m rows max as a target). HASH partitioning is neat, because it evenly distributes values between partitions.

 

Also note that we use a generated always statement for date. Most of the time we're not interested in timestamp, and it's very expensive to process the attribute vector of timestamps when you only need the date. Materializing the date allows for a minimum of 24x more efficient time series processing.

 

CREATE USER WIKI PASSWORD "Initial123";

DROP TABLE "WIKI"."PAGECOUNTS";

CREATE COLUMN TABLE "WIKI"."PAGECOUNTS" (

    "WIKITIME" TIMESTAMP,

    "WIKIDATE" DATE GENERATED ALWAYS AS to_date("WIKITIME"),

    "PROJECT" VARCHAR(25),

    "TITLE" VARCHAR(2048),

    "PAGEVIEWS" BIGINT,

    "SIZE" BIGINT) PARTITION BY HASH(TITLE) PARTITIONS 16;

 

Step 3 - Download and load the data

 

The friendly folks at Your.org maintain an excellent mirror of the Wikipedia Page Views data. There are a few challenges with this data, from a HANA perspective.

 

First, it comes in hourly files of around 100MB, which means you have to process a lot of files. So, we wrote a batch script that allows processing of a lot of files (we used this script on all 70,000 files in a modified form to allow for much more parallel processing than your AWS developer instance can cope with!).

 

Second, they are gzipped, and we don't want to unzip the whole lot as that would be huge and takes a lot of time. So the script unzips them to a RAM disk location for speed of processing.

 

Third, the files are space delimited and don't contain the date and time in them, to save space. For efficient batch loading into HANA without an ETL tool like Data Services, we reformat the file before writing to RAM disk, to contain the timestamp as the first column, and be CSV formatted with quotes around the titles.
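For reference, each reformatted hourly file is then bulk-loaded into the table; a minimal sketch of the kind of IMPORT statement involved (the path, thread and batch values are only examples, and the attached script drives this for every file):

IMPORT FROM CSV FILE '/ramdisk/pagecounts-20141001-000000.csv'
  INTO "WIKI"."PAGECOUNTS"
  WITH RECORD DELIMITED BY '\n'
       FIELD DELIMITED BY ','
       OPTIONALLY ENCLOSED BY '"'
       THREADS 8 BATCH 200000;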

 

Anyhow, the script is attached as hanaloader.sh. You need to copy this script to your AWS system and run it as the HANA user. Sit back and relax for an hour whilst it loads. The script is uploaded as a txt file so please remember to rename as .sh

 

Please follow these instructions to run the script:

 

-- login to the server as root and run the following:


     mkdir /vol/vol_HDB/sysfiles/wiki

     chown hdbadm:sapsys /vol/vol_HDB/sysfiles/wiki

     chmod u=wrx /vol/vol_HDB/sysfiles/wiki

     su - hdbadm

     cd /vol/vol_HDB/sysfiles/wiki


-- place wikiload.sh in this folder


-- edit wikidownload.sh as described in the comments in the file


-- Once ready run as follows:

     ./wikidownload.sh 2014 10

 

Step 4 - Install HANA Studio

 

Whilst this is loading, go ahead and get SAP Development Tools for Eclipse installed. If you have access to SAP Software Downloads, you could alternatively use HANA Studio. Make sure you are on at least Revision 80, because otherwise the developer tools won't work.

 

Step 5 - Testing

 

Well now you have a database table populated with 3 weeks of Wikipedia page views. You can test a few SQL scripts to make sure it works, for example:

 

SELECT WIKIDATE, SUM(PAGEVIEWS) FROM WIKI.PAGECOUNTS GROUP BY WIKIDATE;

SELECT WIKIDATE, SUM(PAGEVIEWS) FROM WIKI.PAGECOUNTS WHERE TITLE = 'SAP' GROUP BY WIKIDATE;

 

Note how when you filter, performance dramatically improves. This is the way that HANA works - it's far faster to scan (3bn scans/sec/core) than it is to aggregate (16m aggs/sec/core). That's one of the keys to HANA's performance.

 

Next Steps

 

This is just the first of a multi-part series. Here's what we're going to build next:

 

Part 2: Building the OLAP model. We use the HANA Developer Perspective to build a virtual model that allows efficient processing and ad-hoc reporting in Lumira.

Part 3: Predictive Analysis. We build a predictive model that allows on the fly prediction of future page views.

Part 4: Web App. We expose the model via OData and build a simple web app on the top using SAP UI5.

 

I just want to say a big thanks to Werner Steyn, Lars Breddemann, Brenton OCallaghan and Lloyd Palfrey for their help with putting all this together.

 

Keep tuned for next steps!

SAP HANA talks to Twitter for Prediction !!

$
0
0

This blog focuses on how you can connect to social media, for example Twitter, and work on text analysis or predictive analysis with the social media data. Use cases could be the Football World Cup, EPL, Cricket IPL T20 and many other predictive apps.

 

Twitter has become one of the most widely used platforms for trending data via # (hashtags).

 

I came across a use case where we worked on predictive analysis with tweets. Shared here are some of the key points on how we set up the connectivity between HANA & Twitter, and how we read these tweets, store them in the HANA DB and take them further for predictive analysis.

In this example I have used SAP HANA XSJS. You can use the Java APIs as well.


With HANA XSJS I tried two solutions: 1) UI request -> XSJS -> Twitter -> Response back to UI

2) Use of XS Jobs for getting Tweets + Separate Service call for UI rendering.


Ok now let’s go ahead and see how it’s done.

 

With Twitter’s new authentication mechanism, the steps below are necessary for a successful connection.

 

HttpDest would look like this:

 

description = "Twitter";
host = "api.twitter.com";
port = 443;
useProxy = false;
proxyHost = "proxy";
proxyPort = 8080;
authType = none;
useSSL = true;

Another important step would be to setup the TRUST STORE from XS ADMIN tool :

 

Outbound httpS with HANA XS (part 2) - set up the trust relation

 

Twitter allows applications to issue authenticated requests on behalf of the application itself (as opposed to a specific user).

 

We need to create a Twitter application (Manage Apps) in https://dev.twitter.com/. The settings tab/OAuth tool gives us the Consumer Key and Consumer Secret, which are very important for setting up the request authorization header.

 

OAuth keys.JPG

The critical step is to get the bearer token ready:

  1. URL-encode the consumer key according to RFC 1738 - xvz1evFS4wEEPTGEFPHBog
  2. URL-encode the consumer secret according to RFC 1738 - L8qq9PZyRg6ieKGEKhZolGC0vJWLw8iEJ88DR
  3. Concatenate the encoded consumer key, a colon character “:”, and the encoded consumer secret into a single string.

        xvz1evFS4wEEPTGEFPHBog:L8qq9PZyRg6ieKGEKhZolGC0vJWLw8iEJ88DR

   4. Base64-encode the string from the previous step - V2RlM0d0VFUFRHRUZQSEJvZzpMOHFxOVBaeVJnNmll==


There you have the BASIC token. We need to get a BEARER token from the BASIC token by issuing a POST:

https://dev.twitter.com/oauth/reference/post/oauth2/token

 

Response:

HTTP/1.1 200 OK

Status: 200 OK

Content-Type: application/json; charset=utf-8

Content-Encoding: gzip

Content-Length: 140

{"token_type":"bearer","access_token":"AAAA%2FAAA%3DAAAAAAAA"}

 

The BEARER token is what your XSJS service uses to talk to Twitter. To check your Twitter URL request format you can use the Twitter Developer Console.


Console.JPG

Response:

 

Reponse.JPG

 

XSJS code snippet :

 

var dest = $.net.http.readDestination("Playground", "twitter");
var client = new $.net.http.Client();
var url_suffix = "/1.1/search/tweets.json?q=#SAPHANA&count=1";
var req = new $.web.WebRequest($.net.http.GET, url_suffix); //MAIN URL
req.headers.set("Authorization","Bearer AAAAAAAA");
client.request(req, dest);

 

If all is set up correctly, we get a response that contains the tweet text in the statuses array.

 

var response = client.getResponse();
var body = response.body.asString();
myTweets = JSON.parse(body);

myTweets.statuses[index].text ===> Tweet data

 

Once you have the tweets array, you can loop over it, process the tweets and store them in the SAP HANA DB for further predictive analysis.

 

var conn = $.db.getConnection();
// insert_hana_tweet is the prepared INSERT statement string with one '?' parameter per column
var pstmt = conn.prepareStatement(insert_hana_tweet);
pstmt.setString(1, myTweets.statuses[index].id_str);
pstmt.setString(2, myTweets.statuses[index].text);
pstmt.setString(3, myTweets.statuses[index].created_at);
pstmt.setString(4, myTweets.statuses[index].user.screen_name);
pstmt.execute();
conn.commit();
pstmt.close();
conn.close();
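The target table and the insert_hana_tweet statement string are not shown in the original post; a minimal sketch of what they could look like (schema, table and column names are assumptions, not from the actual project):

-- Hypothetical target table for the tweets
CREATE COLUMN TABLE "PLAYGROUND"."TWEETS" (
    "ID_STR"      VARCHAR(32) PRIMARY KEY,
    "TEXT"        NVARCHAR(500),
    "CREATED_AT"  VARCHAR(64),
    "SCREEN_NAME" NVARCHAR(100)
);
-- Matching prepared statement for the XSJS snippet above:
-- insert_hana_tweet = 'INSERT INTO "PLAYGROUND"."TWEETS" VALUES (?, ?, ?, ?)'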


In my use case I have used the "search" tweets API, but you have many other options, e.g. read messages, retweets, followers and so on.

The request parameters can be set based on your use case, e.g. most recent tweets (result_type=recent), number of tweets to fetch (count=10), fetch all tweets after a specific tweet id (since_id).

 

I have the XS service in an XS job which checks for tweets and stores them in the HANA DB.

 

{
    "description": "Insert Tweets",
    "action": "Playground:job_twitter.xsjs::collectTweet",
    "schedules": [
        {
            "description": "Tweets",
            "xscron": "* * * * * * *"
        }
    ]
}

 

Have a look at the developer guide to understand the various options available for scheduling. The xscron parameter is used for setting the schedule of the job.

Talk to Twitter and keep trending! Hope this post was helpful.

 

Avinash Raju

SAP HANA Consultant

www.exa-ag.com

Unleashing Lightening: Building a quarter trillion rows in SAP HANA

$
0
0

For the Keynote at SAP TechEd/d-code 2014, we built out a quarter trillion row model in a single scale-up HANA system. You can read the high level Unleashing Lightening with SAP HANA overview and watch the video.

 

I thought that people might be interested in how the demo was built, and the logistical and technical challenges of loading such a large data model.

 

Building the SGI UV300H

 

The first challenge we had was finding a system big enough for 30TB of flat files at short notice. The SGI UV300H is a scale-up HANA appliance, made up from 4-socket building blocks. The SGI folks therefore had to string 8 of these together using their NUMAlink connectors and attach 6 NetApp direct attached storage arrays for a total of 30TB of storage.

 

Today, only 4- and 8-socket SGI systems are certified and the 32-socket system is undergoing certification. The nature of the UV300H means that there is non-uniform data access speed. On a 4-socket Intel system you have either local (100ns) or remote (300ns) - you can read Memory Latencies on Intel® Xeon® Processor E5-4600 and E7-4800 product families for more details.

 

With the NUMAlink system there is also a hop via NUMAlink to the remote chassis, which increases the memory latency to 500ns. Whilst that is blindingly fast by any standard, it increases the non-uniformity of RAM access on HANA. For SAP HANA SPS09, SAP optimized HANA for the UV300H by improving average memory locality.

 

However HANA SPS09 wasn't available, so we ran on stock SAP HANA SPS08 Revision 83. It's tough to say how big a penalty this cost us, but on a theoretical 7.5bn aggregations/sec, we got closer to 5bn, so I'm guessing SPS09 would provide a 25-50% hike in performance under certain circumstances.

 

But to be clear, this is the same HANA software that you run on any HANA server, like AWS Developer Edition. There was no customization involved.

 

Downloading 30TB of flat files

 

Here was our next challenge. I did the math, and realized this was going to take longer than the time available, so I put a call into Verizon FIOS to see if they could help. They came out the next day and installed a new fiberoptic endpoint which could service up to 300/300Mbit internet. With my laptop hard-wired into the router, we could get a constant 30MByte/sec download from the Your.Org Wikimedia mirror. Thanks guys!

 

Once these were on USB hard disks, we shipped them to SGI Labs, which cost another 4 days, due to the Columbus Day holiday.

 

From there, we found we could load into HANA faster than we could copy the files onto the server (USB 2.0).

 

Building the HANA Model

 

Thankfully, I have a few smaller HANA systems in my labs, so I tested the configuration on a 4S/512GB system with 22bn rows, and on a scale-out 4x4S/512GB system with 100bn rows. There were a few things that we found that would later be of value.

 

First, partitioning by time (month) is useful, because you can load a month at a time, and let HANA merge and compress the last month whilst you load the next month. This saves the constant re-merging that happens if you don't partition by time. A secondary partition by title is useful, because it ensures partition pruning during data access, which means that much less RAM is scanned for a specific query. This led to a RANGE(MONTH), HASH(TITLE) two-level partition strategy, which is very typical of data of this type.

 

Second, the amount of data we had meant that it was going to be most practical to logically partition the data into tables by year. This wasn't strictly necessary, but it meant that if something went wrong with one table, it wouldn't require a full load. This decision was vindicated because user error meant I wiped out one table the night before the Keynote, and it was easily possible to reload that year.

 

Third, a secondary index on TITLE was used. This was based on research by Lars Breddemann and Werner Steyn (Further Playing with SAP HANA), which led us to understand that when a small amount of data is selected from a large table, an index on the filter predicate column is beneficial. Therefore if the SQL query is SELECT DATE, SUM(PAGEVIEWS) FROM PAGECOUNTS WHERE TITLE = 'Ebola' GROUP BY DATE, then a secondary index on TITLE will increase performance.
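Reusing the table definition from the companion Part 1 blog, a minimal sketch of that index and the kind of highly selective query that benefits from it (the index name is an assumption):

CREATE INDEX "IDX_PAGECOUNTS_TITLE" ON "WIKI"."PAGECOUNTS"("TITLE");

SELECT "WIKIDATE", SUM("PAGEVIEWS")
  FROM "WIKI"."PAGECOUNTS"
 WHERE "TITLE" = 'Ebola'
 GROUP BY "WIKIDATE";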

 

Fourth, we built a simple HANA model to UNION back in all the tables in a Calculation View, and join it to the M_TIME_DIMENSION system table so we could get efficient time aggregation in OData and ensure query pruning.

 

Optimizing SAP HANA for the UV300H

 

By this time, we had 30TB of flat files on the /hana/shared folder of the UV300H and had to get them loaded. We realized there was a challenge, which is the Wikipedia files come in space delimited, with no quotes around text, and the date is in the filename, not a column. We didn't have Data Services or another ETL product, and the fastest way to get data into HANA is using the bulk loader.

 

So, I wrote a script which uncompressed each file into a memory pipe, reformatted it in awk to contain the timestamp, converted it to CSV with quotes, wrote it out to a RAMdisk, ran it through the bulk loader and deleted the RAMdisk file. Each hour takes around 20 seconds to process, and I ran 12 threads in parallel, plus an additional 40 threads for the bulk loader process.

 

What we realized at this point was that SAP HANA SPS08 wasn't optimized for the amount of power that the UV300H had, so I tweaked the settings to be more aggressive, particularly with mergedog, which only uses 2 CPU cores by default. We enabled the integrated statistics server, installed PAL and the script server.

 

In addition, I found that it was necessary not to be too aggressive, because the log volume is only 500GB, and you can easily fill this up between 5 minute savepoints if you get too aggressive with loading (remember you have to buffer enough logs until the savepoint is complete). I suspect the certified 32-socket system will have a 1 or 2TB log volume for this reason.

 

Other than that, we pretty much found that it just worked. Here's a screenshot of using all the 960 vCores in the UV300H during some early query testing. I'm sure glad I didn't pay for the power bill!

Screen Shot 2014-10-14 at 3.41.44 PM.png

Building the Web App in HANA XS

 

We put together the Web App in HANA XS using SAP UI5 controls and OData services to access the underlying data model. More on this in a later blog when Brenton OCallaghan is going to describe how it was built.

 

What's critical about this is that the OData service, which is accessed directly by the browser, runs in-memory and has direct access to the calculation scenario generated by the SAP HANA Calculation View. This means that the response time in the browser is very little more than that of a SQL query run in a console on the server itself.

 

There were really no special considerations required to use HANA XS with a model of this size - it worked exactly the same as for any other HANA model. One thing we did to ensure we didn't cause problems was to restrict the HANA models so you couldn't return very large data volumes by using Input Parameters. This means you can't return 250bn rows in a browser!

 

Final Words

 

I've said this in the other blog, but whilst there were huge logistical challenges in building a model like this in 10 days, HANA made it possible. The fact that HANA self-optimizes the whole model for compression and query performance and requires no tuning is a huge benefit. Once we had built a simple data model, we were able to directly load all the data overnight.

 

One thing that was worth noting is because of the direct attached storage model in the UV300H, we found we can load around 200GB/minute into RAM (once it has been loaded and merged once). That means we can load the entire 6TB model on this system in around 30 minutes, which is the fastest load speed I've ever seen on a HANA system.

 

Anyhow, the purpose of this blog was to open the kimono on specifically how we built this demo, and to show that there was no special optimization to do so. This, despite the fact that the UV300H 32-socket edition certification is still in progress and the HANA optimizations for it weren't available to us. If you have any questions on it then please go ahead and ask them, I'd be very happy to answer.

 

And remember if you'd like to Build your own Wikipedia Keynote Part 1 - Build and load data then please follow that series - we'll be building out the whole demo in 4 parts.


Setup SAML SSO from BI to HANA using SapCryptoLib

$
0
0

Overview

This blog describes how to use the SAP Crypto Library to enable SAML SSO from SAP BI4 to the SAP HANA DB. If you want to use OpenSSL instead, please check the other SCN blog for details.

 

Turn on SSL using SAP Crypto Library

 

1.     Install SAP Crypto library

The SAP Crypto Library can be downloaded from the Service Marketplace. Browse to http://service.sap.com/swdc, expand Support Packages and Patches -> Browse our Download Catalog -> SAP Cryptographic Software -> SAPCRYPTOLIB -> SAPCRYPTOLIB 5.5.5 -> Linux on x86_64 64bit.

 

The new CommonCryptoLib (SAPCRYPTOLIB) Version 8.4.30 (or higher) is fully compatible with previous versions of SAPCRYPTOLIB, but adds features of SAP Single Sign-On 2.0 Secure Login Library. It can be downloaded in this location:

expand Support Packages and Patches -> Browse our Download Catalog -> Additional Components -> SAPCRYPTOLIB -> COMMONCRYPTOLIB 8


Use SAPCAR to extract sapgenpse and libsapcrypto.so to /usr/sap/<SID>/SYS/global/security/lib/

Add the directory containing the SAP Crypto libraries to your library path:

  export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/sap/<SAPSID>/SYS/global/security/lib

 

2.     Create the SSL key pair and certificate request files

  • Copy sapgenpse to the $SECUDIR directory. Then run sapgenpse to generate the sapsrv.pse file and SAPSSL.req file:

  ./sapgenpse get_pse -p sapsrv.pse -r SAPSSL.req "CN=<FQDN of the host>"

 

  • Send the Certificate Request to a Certificate Authority to be signed. Browse to http://service.sap.com/trust, and expand SAP Trust Center Services in Detail, and click SSL Test Server Certificates, and then click the ‘Test it Now!’ button. Paste the content from the SAPSSL.req file to the text box, and click Continue.
    1.png
    SAP returns the signed certificate as text, copy this text and paste it into a file on the HANA server: 
    /usr/sap/<sid>/HDB<instance_nr>/<hostname>/sec/SAPSSL.cer
  • Download the  SAP SSL Test Server CA Certificate from the http://service.sap.com/trust site:
    6.png


  • Import the Signed Certificate using sapgenpse
    ./sapgenpse import_own_cert -c SAPSSL.cer -p sapsrv.pse -r SAPServerCA.cer
3. Check HANA settings
Indexserver.ini->[Communication]->sslcryptoprovider = sapcrypto

 

 

4. Restart HANA, and test if SSL works from HANA Studio


Click on the "Connect using SSL" option in the properties of the connection.  Once done, a lock will appear in the connection in HANA Studio
2.png

Create Certificate file for BO instance.

 

  1. Create HANA Authentication connection
    Log onto BO CMC -> Application -> HANA Authentication and click New. After providing the HANA hostname and port, and the IDP name, click the Generate button, then click the OK button; you will see an entry added for HANA authentication.
    10-22-2014 10-07-46 AM.png
  2. Copy the content of the generated certificate and paste it to a file on your HANA server:

    /usr/sap/<sid>/HDB<instance_nr>/<hostname>/sec/sapid.cer
  3. Add the certification to the pse file:

./sapgenpse maintain_pk -p sapsrv.pse -a sapid.cer

3.png

4. You may need to Restart HANA to make the new pse file take effect.

 

SAML configuration in HANA

 

  1. Create SAML provider in HANA


You can import the SAML identity provider from the certificate file (sapid.cer) which you created in the last step via Security -> Open Security Console -> SAML Identity Providers. Make sure you have chosen the SAP Cryptographic Library.

5.png

 

2. Create a HANA user TESTUSER with SAML authentication.

Check the SAML option, click the Configure link, then add the identity provider 'HANA_BI_PROVIDER' created in the last step for the external user 'Administrator'.

4.png
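The provider and user mapping can also be set up in SQL; a minimal sketch (the subject and issuer values must come from the certificate generated in the BI CMC, and the names below are only examples of this scenario):

-- Create the identity provider and map an external identity to the HANA user
CREATE SAML PROVIDER HANA_BI_PROVIDER
  WITH SUBJECT 'CN=<subject of the BI certificate>'
       ISSUER  'CN=<issuer of the BI certificate>';

CREATE USER TESTUSER WITH IDENTITY 'Administrator' FOR SAML PROVIDER HANA_BI_PROVIDER;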

 

 

Test SAML authentication

 


Go to BO CMC -> Application -> HANA Authentication, edit the entry created in the previous step, and click the "Test Connection" button.

7.png

 

Troubleshooting

If the connection test is not successful, please change the trace level of the following to DEBUG:


indexserver.ini - authentication, xssamlproviderconfig


The index server trace will provide more information on why the authentication failed.

 

Reference

 

How to Configure SSL for SAP HANA XSEngine using SAPCrypto

Configuring SAML with SAP HANA and SAP BusinessObjects 4.1 - Part 1

Use SAML to Enable SSO for your SAP HANA XS App

Troubleshooting Standalone Statisticsserver – out of memory issue:

$
0
0

From the statisticsserver trace below, look at the memory consumption of the statisticsserver (highlighted in red). Pay attention to the PAL (process allocation limit), AB (allocated bytes) and U (used) values. When the U value is close to, equal to or bigger than the PAL value, this indicates that an out-of-memory situation occurred.

 

Symptoms:

 

[27787]{-1}[-1/-1] 2014-09-25 16:10:22.205322 e Memory ReportMemoryProblems.cpp(00733) : OUT OF MEMORY occurred.

Failed to allocate 32816 byte.

Current callstack:

1: 0x00007f2d0a1c99dc in MemoryManager::PoolAllocator::allocateNoThrowImpl(unsigned long, void const*)+0x2f8 at PoolAllocator.cpp:1069 (

  1. libhdbbasis.so)

2: 0x00007f2d0a24b900 in ltt::allocator::allocateNoThrow(unsigned long)+0x20 at memory.cpp:73 (libhdbbasis.so)

3: 0x00007f2cf78060dd in __alloc_dir+0x69 (libc.so.6)

4: 0x00007f2d0a247790 in System::UX::opendir(char const*)+0x20 at SystemCallsUNIX.cpp:126 (libhdbbasis.so)

5: 0x00007f2d0a1016dc in FileAccess::DirectoryEntry::findFirst()+0x18 at SimpleFile.cpp:511 (libhdbbasis.so)

6: 0x00007f2d0a1025da in FileAccess::DirectoryEntry::DirectoryEntry(char const*)+0xf6 at SimpleFile.cpp:98 (libhdbbasis.so)

7: 0x00007f2d0a04872f in Diagnose::TraceSegmentCompressorThread::run(void*&)+0x26b at TraceSegment.cpp:150 (libhdbbasis.so)

8: 0x00007f2d0a0c0dcb in Execution::Thread::staticMainImp(void**)+0x627 at Thread.cpp:475 (libhdbbasis.so)

9: 0x00007f2d0a0c0f6d in Execution::Thread::staticMain(void*)+0x39 at Thread.cpp:543 (libhdbbasis.so)

Memory consumption information of last failing ProvideMemory, PM-INX=103393:

Memory consumption information of last failing ProvideMemory, PM-INX=103351:

IPMM short info:

GLOBAL_ALLOCATION_LIMIT (GAL) = 200257591012b (186.50gb), SHARED_MEMORY = 17511289776b (16.30gb), CODE_SIZE = 6850695168b (6.37gb)

PID=27562 (hdbnameserver), PAL=190433938636, AB=2844114944, UA=0, U=1599465786, FSL=0

PID=27674 (hdbcompileserve), PAL=190433938636, AB=752832512, UA=0, U=372699315, FSL=0

PID=27671 (hdbpreprocessor), PAL=190433938636, AB=760999936, UA=0, U=337014040, FSL=0

PID=27746 (hdbstatisticsse), PAL=10579663257, AB=10512535552, UA=0, U=9137040196, FSL=0

PID=27749 (hdbxsengine), PAL=190433938636, AB=3937583104, UA=0, U=2352228788, FSL=0

PID=27743 (hdbindexserver), PAL=190433938636, AB=155156312064, UA=0, U=125053733102, FSL=10200547328

Total allocated memory= 198326363056b (184.70gb)

Total used memory     = 163214166171b (152gb)

Sum AB                = 173964378112

Sum Used              = 138852181227

Heap memory fragmentation: 17% (this value may be high if defragmentation does not help solving the current memory request)

Top allocators (ordered descending by inclusive_size_in_use).

1: / 9137040196b (8.50gb)

2: Pool 8130722166b (7.57gb)

3: Pool/StatisticsServer 3777958248b (3.51gb)

4: Pool/StatisticsServer/ThreadManager                                     3603328480b (3.35gb)

5: Pool/StatisticsServer/ThreadManager/Stats::Thread_3                     3567170192b (3.32gb)

6: Pool/RowEngine 1504441432b (1.40gb)

7: AllocateOnlyAllocator-unlimited 887088552b (845.99mb)

8: Pool/AttributeEngine-IndexVector-Single                                 755380040b (720.38mb)

9: AllocateOnlyAllocator-unlimited/FLA-UL<3145728,1>/MemoryMapLevel2Blocks 660602880b (630mb)

10: AllocateOnlyAllocator-unlimited/FLA-UL<3145728,1>                       660602880b (630mb)

1: Pool/RowEngine/RSTempPage 609157120b (580.93mb)

12: Pool/NameIdMapping                                                      569285760b (542.91mb)

13: Pool/NameIdMapping/RoDict 569285696b (542.91mb)

14: Pool/RowEngine/LockTable 536873728b (512mb)

15: Pool/malloc                                                             429013452b (409.13mb)

16: Pool/AttributeEngine 253066781b (241.34mb)

17: Pool/RowEngine/Internal 203948032b (194.50mb)

18: Pool/malloc/libhdbcs.so 179098372b (170.80mb)

19: Pool/StatisticsServer/LastValuesHolder                                  167034760b (159.29mb)

20: Pool/AttributeEngine/Delta 157460489b (150.16mb)

Top allocators (ordered descending by exclusive_size_in_use).

1: Pool/StatisticsServer/ThreadManager/Stats::Thread_3                     3567170192b (3.32gb)

2: Pool/AttributeEngine-IndexVector-Single 755380040b (720.38mb)

3: AllocateOnlyAllocator-unlimited/FLA-UL<3145728,1>/MemoryMapLevel2Blocks 660602880b (630mb)

4: Pool/RowEngine/RSTempPage 609157120b (580.93mb)

5: Pool/NameIdMapping/RoDict 569285696b (542.91mb)

6: Pool/RowEngine/LockTable 536873728b (512mb)

7: Pool/RowEngine/Internal                                                 203948032b (194.50mb)

8: Pool/malloc/libhdbcs.so 179098372b (170.80mb)

9: Pool/StatisticsServer/LastValuesHolder                                  167034760b (159.29mb)

10: StackAllocator                                                          116301824b (110.91mb)

11: Pool/AttributeEngine/Delta/LeafNodes                                    95624552b (91.19mb)

12: Pool/malloc/libhdbexpression.so 93728264b (89.38mb)

13: Pool/AttributeEngine-IndexVector-Sp-Rle                                 89520328b (85.37mb)

14: AllocateOnlyAllocator-unlimited/ReserveForUndoAndCleanupExec            84029440b (80.13mb)

15: AllocateOnlyAllocator-unlimited/ReserveForOnlineCleanup                 84029440b (80.13mb)

16: Pool/RowEngine/CpbTree 68672000b (65.49mb)

17: Pool/RowEngine/SQLPlan 63050832b (60.12mb)

18: Pool/AttributeEngine-IndexVector-SingleIndex                            57784312b (55.10mb)

19: Pool/AttributeEngine-IndexVector-Sp-Indirect                            56010376b (53.41mb)

20: Pool/malloc/libhdbcsstore.so 55532240b (52.95mb)

[28814]{-1}[-1/-1] 2014-09-25 16:09:19.284623 e Mergedog Mergedog.cpp(00198) : catch ltt::exception in mergedog watch thread run(

): exception  1: no.1000002  (ptime/common/pcc/pcc_MonitorAlloc.h:59)

    Allocation failed

exception throw location:

 

 

You can refer to the two solutions below if the HANA system is not ready to switch to the embedded statisticsserver for any reason.

Solution A)

 

1) If the statistics server is down and inaccessible, kill the hdbstatisticsserver PID at OS level. The statisticsserver will be restarted immediately by the hdb daemon.

 

2) Check the memory consumed by the statisticsserver:
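A minimal sketch of such a check in SQL, assuming the monitoring view SYS.M_SERVICE_MEMORY is available and the standalone statistics server is listed there under the service name 'statisticsserver':

-- Memory used by and allocation limit of the statisticsserver (values in GB)
select HOST, SERVICE_NAME,
       round(TOTAL_MEMORY_USED_SIZE / 1024 / 1024 / 1024, 2)     as "USED_GB",
       round(EFFECTIVE_ALLOCATION_LIMIT / 1024 / 1024 / 1024, 2) as "ALLOC_LIMIT_GB"
from SYS.M_SERVICE_MEMORY
where SERVICE_NAME = 'statisticsserver'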

 

 

3) Check whether the statistics server deletes old data: go to Catalog -> _SYS_STATISTICS -> Tables, spot-check tables starting with GLOBAL* and HOST*, and sort by SNAPSHOT_ID in ascending order. Ensure the oldest date matches the retention period.

 

Alternatively, you can run: select min(SNAPSHOT_ID) from _SYS_STATISTICS.<TABLE>

 

Eg:
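As an illustration, using HOST_WORKLOAD (one of the history tables mentioned in the next step):

-- Oldest snapshot still kept for this statistics server history table
select min(SNAPSHOT_ID) from _SYS_STATISTICS.HOST_WORKLOAD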


 

 

 

4) Check the retention period of each table in Configuration -> statisticsserver -> statisticsserver_sqlcommands


eg:

30 days for HOST_WORKLOAD
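If you prefer SQL over the Studio configuration screen, the same retention settings can be read from the inifile contents view. This is only a sketch and assumes the standalone statisticsserver keeps its cleanup statements in statisticsserver.ini, section statisticsserver_sqlcommands:

-- List the configured cleanup/retention SQL commands of the standalone statisticsserver
select KEY, VALUE
from SYS.M_INIFILE_CONTENTS
where FILE_NAME = 'statisticsserver.ini'
  and SECTION = 'statisticsserver_sqlcommands'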

 

5) If old data is kept for more than 30 days (or you want to delete old data by shortening the retention period), follow SAP Note 1929538 - HANA Statistics Server - Out of Memory -> Option 1:


Create the procedure using the file attached to SAP Note 1929538 and run: call set_retention_days(20);

 

6) Once done, you’ll see that data older than 20 days gets deleted:

 

The memory consumption of the statisticsserver is reduced:


Also, the minimum SNAPSHOT_ID is updated and now goes back only 20 days, matching the new retention period:



7) You can reset the retention period to the default value at any time by calling call set_retention_days(30); or by restoring every SQL command to its default in statisticsserver_sqlcommands.


Solution B)

i) Follow SAP Note 1929538 - HANA Statistics Server - Out of Memory and increase the allocation limit for the statisticsserver. This can only be done while the statisticsserver is up and accessible; otherwise, you need to kill and restart it first.
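As a sketch only (the value below is an example in MB; verify the recommended sizing in SAP Note 1929538 before applying, and note that the parameter is assumed to live in statisticsserver.ini under [memorymanager]):

-- Raise the allocation limit of the standalone statisticsserver to 10240 MB
ALTER SYSTEM ALTER CONFIGURATION ('statisticsserver.ini', 'SYSTEM')
  SET ('memorymanager', 'allocationlimit') = '10240' WITH RECONFIGURE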

 

 

 

The script HANA_Histories_RetentionTime_Rev70+ from SAP Note 1969700 - SQL Statement Collection for SAP HANA provides a good overview of retention times.


My two cents: for any statisticsserver OOM error, always check the memory usage of the statisticsserver and ensure obsolete data gets deleted after the retention period, instead of blindly increasing the allocation limit for the statisticsserver.


Additionally, you can also refer to SAP Note 2084747 - Disabling memory intensive data collections of standalone SAP HANA statisticsserver to disable data collections that consume a lot of memory.


Hope it helps,


Thanks,

Nicholas Chang






Finding Excitement in the World of Data Processing



Before my tenure at SAP I worked in the sales group of a business intelligence (BI) startup, and I used to pinch-hit for our under-staffed training department. This meant that when they were in a bind, I’d occasionally jump in to do on-site training sessions with a new customer deploying our BI software.

 

While I enjoyed showing the solutions without the pressure of having to close a sales deal, I always found the database section, where I did a relational database 101 overview and connected our software, to be quite tedious.

 

Jump forward to today’s hyper-connected world, where everything is digitized and fuels new data-driven business models, and there’s a lot more to be excited about.

 

It’s not the individual advancements in data processing technology that I’m jazzed about… it’s what happens when you combine the data from devices, sensors, and machines to create inventive scenarios and unique business value, especially given the expanding data challenges organizations face.

 

[Image: The changing world of data.jpg]
As an example, if a software provider or enterprise customer strings together a sensor with an embedded or remote database, adds real-time event processing software and a data warehouse with in-memory computing, and then tosses in predictive analytics for good measure, they have a great recipe for:
- A smart vending machine that can deliver user recommendations and transaction history, or tell the candy supplier when a refill is needed.
- Intelligent plant equipment that captures its own usage information and provides proactive repair warnings based on historical failure data.
- Real-time fleet management systems that calculate the optimal distribution of work to maximize efficiency, distributing work to fleet assets in real time.

 

Add cloud and mobile, and all of that exciting data management utility and information becomes available anywhere, anytime, with lower TCO…

 

[Image: DM Manufacturing.jpg]

There seems to be an infinite number of operational situations, processes, and business models where end-to-end data management creates new services and revenue streams that deliver customer value.

 

And that’s exciting… 

 

If you’re interested in exploring more use cases like:
- Real-time problem solving
- Smart metering
- Real-time promotions
- Pattern and customer profitability analysis
…then feel free to explore The Changing World of Data Solution Map or our Data Management Solution Brief, or reach out to our OEM team. Many partners are already embedding SAP data management solutions with their offerings to reduce their time to market, differentiate their solutions, and open up new revenue opportunities.

 

Get the latest updates on SAP OEM by following @SAPOEM on Twitter. For more details on the SAP OEM partnership and to learn about SAP OEM platforms and solutions, visit us at www.sap.com/partners/oem.

In-Memory Data Management 2014 Wrap-up


I just reached the final credits of In-Memory Data Management (2014) - Implications on Enterprise Systems, and I’d like to share my thoughts on each session.

 

I won’t explain the content of each week (you can find a very good explanation here), but I’ll give my impressions of what I liked or learned. It’s totally personal; you may find other topics more interesting.

 

Let’s go:

 

Week 1

Lecture 1 - History of Enterprise Computing

When you start to hear a senior man with white hair talking about tape storage, you may think: “What am I doing? I just bought my very modern smartphone and I’m wasting my time listening to this man talk about storing information on… tapes?!” That’s not the case here. I always like to listen to Plattner; it’s like being taught by a Jedi Master. This introduction is very important for understanding the motivation for and birth of in-memory databases.

 

Lecture 2 - Enterprise Application Characteristics

In my ABAP classes I always teach about OLAP/OLTP and the paradigm of having separate machines with different tuning for each one. Here I learned a different story.

 

Lecture 3 - Changes in Hardware

How cheap memory, fast networks, and affordable servers enable in-memory computing.

 

Lecture 4 - Dictionary Encoding

Here is one of the key points of SAP HANA: column storage. Plattner explains columnar storage and starts to talk about compression.

 

Lecture 5 - Architecture Blueprint of SanssouciDB

A very quick explanation of an academic, experimental database.

 

Week 2

Lecture 1 - Compression

It’s another key point of SAP HANA. You will learn about compression techniques and, yes, you will start to do some math to compare them.

 

Lecture 2 - Data Layout

More detail about row vs. column data storage. Pros and cons for each approach and a hybrid possibility.

 

Lecture 3 - Row v. Column Layout (Excursion)

Here Professor Plattner gives more information about data layout.

 

Lecture 4 - Partitioning

As a geek, I had only heard about partitioning when I wanted to install two operating systems on the same machine. Here I learned a very powerful technique that helps parallelism reach higher levels.

 

Lecture 5 - Insert

Insert command. Under the hood.

 

Lecture 6 - Update

Lots of things to modify, reorder, and rewrite.

 

Lecture 7 - Delete

Not deleted, just left behind.

 

Lecture 8 - Insert-Only

Worry about the future without forgetting the past.

 

Week 3

Lecture 1 - Select

Projection, Cartesian Product and Selectivity. All the beautiful theory about retrieving data.

 

Lecture 2 - Tuple Reconstruction

Retrieving a tuple in a row database: piece of cake. Retrieving a tuple in a column database: pain in the …

 

Lecture 3 - Scan Performance

Full table scan: row versus column layout. Show me the numbers!

 

Lecture 4 - Materialization Strategies

Materialization: when the attribute vector and dictionary start to mean something. Here you will learn two strategies for materialization during a query: early and late materialization.

 

Lecture 5 - Differential Buffer

A special buffer to help speed up write operations. Do you remember the insert-only paradigm? It’s the “worry about the future” part.

 

Lecture 6 - Merge

When the differential buffer is merged into the main store. Do you remember the insert-only paradigm? It’s the “without forgetting the past” part.

 

Lecture 7 - Join

Once you learn that retrieving a tuple in a column layout is a pain, you can imagine what a join is like. Here you will learn why.

 

Week 4

Lecture 1 - Parallel Data Processing

A very good lesson about parallel data processing. The lecture and reading material cover both hardware and software aspects of parallelism, with a highlight on MapReduce. I highly recommend you dive deep into it.

 

Lecture 2 - Indices

Presenting the indices of indices: inverted indices. “Using this approach, we reduce the data volume read by a CPU from the main memory by providing a data structure that does not require the scan of the entire attribute vector.” (from the reading material, chapter 18).

 

Lecture 3 - Aggregate Functions

Coming from the old-school ABAP generation, aggregate functions still make my ears itch. However, with the code push-down concept everything changed. Can an old dog still learn new tricks?

 

Lecture 4 - Aggregate Cache

In the past everything was simple: storage on disk, cache in memory. Today, storage is in memory and the cache is in… memory too!? Why do I need a cache with an in-memory database? To cache pre-aggregated data; that is the aggregate cache.

 

Lecture 5 - Enterprise Simulations

Answering a query insanely fast is only part of the game. Now enterprise simulations are possible: change some variables and see the result. OK, it’s not that simple, but it’s awesome anyway!

 

Lecture 6 - Enterprise Simulations on Co-processors (Excursion)

The awesomeness of enterprise simulations with co-processors. Those born before the internet might remember the 387 co-processor; this is “almost” the same. In this presentation we see how co-processors can help with computation-intensive processing.

 

Week 5

Lecture 1 - Logging

If you think that logging is just for checking what happened in the past, or for finding who changed the value that caused yesterday’s biggest production incident, think twice. Logging plays a very important role in the recovery process.

 

Lecture 2 - Recovery

The first thing everyone wonders when they hear about in-memory databases is: “What if the power goes down? Will all my data be wiped out?” Here you learn that it would be, but you also learn how in-memory databases overcome that.

 

Lecture 3 - Replication

I remember a very simplistic definition of the ACID concept: “all or nothing in”. In this lecture we see the “all in” part applied to in-memory databases: how to guarantee ACID in a database stored in RAM.

 

Lecture 4 - Read-only Replication Demo (Excursion)

Replication in action.

 

Lecture 5 - Hot-Standby

It’s a very hot topic (sorry… I won’t do it again). Hot standby works together with replication to guarantee ACID. It’s a good opportunity for you to see why we can say that SAP HANA is a very beautiful piece of engineering.

 

Lecture 6 - Hot-Standby Demo (Excursion)

Hot-standby in action.

 

Lecture 7 - Workload Management and Scheduling

SAP HANA is all about speed, including user response. Professor Plattner explains the importance of having a very responsive system. Here is a quote that summarizes it: “we have to answer the user at the same speed as Excel, otherwise the user will download the data to Excel and work there”.

 

Lecture 8 - Implications on Application Development

What are the implications for those special people who develop applications for users? Code push-down (moving business logic into the database) and stored procedures; yes, we’re still talking about ABAP. Those are the biggest paradigm shifts for ABAP developers.

 

Week 6

Lecture 1 - Database-Driven Data Aging

Carsten Meyer explains new ideas about archiving and old data.

 

Lecture 2 - Actual and Historical Partitions

Cold data is not about age, it’s about usage. ’Nuff said, Professor Plattner.

 

Lecture 3 - Genome Analysis

In-memory computing has huge implications beyond enterprise systems. Let me bring in an excerpt from the “High Performance In-Memory Genome Data Analysis” reading material that helps demystify HANA as a luxury: “Nowadays, a range of time-consuming tasks has to be accomplished before researchers and clinicians can work with analysis results, e.g., to gain new insights”.

 

Lecture 4 - Showcase: Virtual Patient Explorer (Excursion)

Medical and patient use cases with lots and lots of information.

 

Lecture 5 - Showcase: Medical Research Insights (Excursion)

More medical and patient use cases with lots and lots of information.

 

Lecture 6 - Point-of-Sales Explorer

How the in-memory SAP HANA DB helps sales analysis: three tables and 8 billion rows. Featuring the Professor commenting on SAP HANA performance: “freaking unbelievable! People are scared!”

 

Lecture 7 - What’s in it for Enterprises (Excursion)

More benefits of using SAP HANA for the enterprise: decisions can be made on a real-time basis.

 

Lecture 8 - The Enterprise Cloud (Excursion)

Bernd Leukert, member of the Executive Board of SAP, talks about how running a business in the cloud is much more than uploading your files to Dropbox.

 

As I said, these are my impressions of each section. I really enjoyed this training and it’s helping me a lot to understand other SAP HANA trainings.

 

I consider it the cornerstone for anyone who decides to work with SAP HANA.

HANA Health Check


For this blog, I would like to focus on some basic health checks for HANA. These checks can give you a good idea of how your HANA system is performing. We will go through some SQL statements and the thresholds that determine the status of your HANA system. Knowing how the HANA system is performing allows us to plan ahead and avoid unnecessary system disasters.

 

 

 

System Availability:

 

The following query shows you how many times each service was restarted at the specified hour and date within the analyzed period.

select to_dats(to_date("SNAPSHOT_ID")) AS "DATE", hour("SNAPSHOT_ID") AS "HOUR",
       SUBSTR_BEFORE(T1.INDEX, RIGHT(T1.INDEX, 5)) AS "HOST", RIGHT(T1.INDEX, 5) AS "PORT",
       T2.SERVICE_NAME, count("ALERT_ID") AS "NUM_RESTART"
from "_SYS_STATISTICS"."STATISTICS_ALERTS" T1
join "SYS"."M_VOLUMES" T2
  on SUBSTR_BEFORE(T1.INDEX, RIGHT(T1.INDEX, 5)) = T2.HOST and RIGHT(T1.INDEX, 5) = T2.PORT
where ALERT_ID = '004'
  and SNAPSHOT_ID >= add_days(now(), -14)
group by to_date("SNAPSHOT_ID"), hour("SNAPSHOT_ID"), SUBSTR_BEFORE(T1.INDEX, RIGHT(T1.INDEX, 5)), RIGHT(T1.INDEX, 5), T2.SERVICE_NAME
order by to_date("SNAPSHOT_ID") DESC, hour("SNAPSHOT_ID") DESC

 

STATUS / THRESHOLDS

RED: Name server is not running, or the name server / index server had 3 or more restarts in the analyzed period.

YELLOW: Statistics server is not running, the name server / index server had up to 2 restarts in the analyzed period, or the remaining servers had 2 or more restarts in the analyzed period.

GREEN: All other cases.

 

The example below shows that this standalone test system was restarted once on October 22nd, twice on October 21st at around 11pm, and another two times at around 10pm. In total, there are 3 restarts of the indexserver and nameserver in the analyzed period. If the nameserver is currently not running, this would be rated RED. To find out whether the database was restarted manually or for some other reason, you can check the indexserver and nameserver traces for more information. If you need further assistance, please consider opening an incident with Product Support.


[Image: systemAvail3.png]
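To cross-check the restart times against the current state of the services, a small query like the following may help; it assumes SYS.M_SERVICE_STATISTICS exposes ACTIVE_STATUS and START_TIME, as in recent revisions:

-- Current status and last start time of each service
select HOST, PORT, SERVICE_NAME, ACTIVE_STATUS, START_TIME
from SYS.M_SERVICE_STATISTICS
order by HOST, SERVICE_NAME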

 

Top 10 Largest Non-partitioned Column Tables (records)

The following query displays the top 10 non-partitioned column tables and how many records exist in each.

 

SELECT top 10 schema_name, table_name, part_id, record_count
from SYS.M_CS_TABLES
where schema_name not LIKE '%SYS%' and part_id = '0'
order by record_count desc, schema_name, table_name

STATUS / THRESHOLD

RED: Tables with more than 1.5 billion records exist.

YELLOW: Tables with more than 300 million records exist.

GREEN: No table has more than 300 million records.


The threshold chart shows that a column table with more than 300 million records gets a yellow rating. This is not yet critical with regard to the technical limit of 2 billion records, but you should consider partitioning tables that are expected to grow rapidly in the future to ensure parallelization and sufficient performance. For more information, please refer to the SAP Notes below or the SAP HANA Administration Guide.

 

Useful SAP Notes:

- 1650394  - SAP HANA DB: Partitioning and Distribution of Large Tables

- 1909763 - How to handle HANA Alert 17: ‘Record count of non-partitioned column-store tables’
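For illustration only (schema, table, and column names are made up; choose the partitioning scheme from SAP Note 1650394 and the Administration Guide that fits your data model), a large column table can be split into hash partitions like this:

-- Split a large column table into 4 hash partitions on a key column
ALTER TABLE "MYSCHEMA"."SALES_ITEMS" PARTITION BY HASH ("ITEM_ID") PARTITIONS 4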

 

Top 10 Largest Partitioned Column Tables (records)

This check displays the 10 largest partitioned column tables in terms of the number of records.


select top 10 schema_name, table_name, part_id, record_count
from SYS.M_CS_TABLES
where schema_name not LIKE '%SYS%' and part_id <> '0'
order by record_count desc, schema_name, table_name

STATUS / THRESHOLD

RED: A table with more than 1.9 billion records exists.

YELLOW: A table with more than 1.5 billion and fewer than 1.9 billion records exists.

GREEN: No table has more than 1.5 billion records.

 

The recommendation is to consider re-partitioning once a table has passed 1.5 billion records, as the technical limit is two billion records per table partition. If a table has more than 1.9 billion records, you should re-partition as soon as possible. For more information, please refer to the SAP Note below or the SAP HANA Administration Guide.

 

Useful SAP Notes:

-   1650394  - SAP HANA DB: Partitioning and Distribution of Large Tables

 

Top 10 Largest Column Tables in Terms of Delta size (MB):

This check displays the 10 largest column tables in terms of the size of the delta and history delta stores.


select top 10 schema_name, table_name, part_id,
       round(memory_size_in_main / (1024*1024), 2), round(memory_size_in_delta / (1024*1024), 2),
       record_count, RAW_RECORD_COUNT_IN_DELTA
from SYS.M_CS_TABLES
where schema_name not LIKE '%SYS%'
order by memory_size_in_delta desc, schema_name, table_name

STATUS / THRESHOLD

RED: MEMORY_SIZE_IN_DELTA > 10 GB

YELLOW: MEMORY_SIZE_IN_DELTA >= 5 GB and <= 10 GB

GREEN: MEMORY_SIZE_IN_DELTA < 5 GB

 

The mechanism of main and delta storage allows high compression and high write performance. Write operations are performed on delta store and changes are taken over from the delta to main store asynchronously during Delta Merge. The column store performs a delta merge if one of the following events occurs:

- The number of lines in delta storage exceeds the specified limit

- The memory consumption of the delta storage exceeds the specified limit

- The delta log exceeds the defined limit

 

Ensure that delta merges for all tables are enabled either by automatic merge or by application-triggered smart merge. In critical cases trigger forced merges for the mentioned tables. For more detail, please refer to the following SAP Note or the SAP HANA Administration Guide.

 


Useful SAP Notes:

- 1977314 - How to handle HANA Alert 29: 'Size of delta storage of column-store tables'
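For the forced merge mentioned above, a sketch only (the table name is an example; a forced merge bypasses the cost-based merge decision and should be used with care):

-- Request a smart delta merge for one table
MERGE DELTA OF "MYSCHEMA"."SALES_ITEMS";
-- Force the merge even if the normal merge criteria are not met
MERGE DELTA OF "MYSCHEMA"."SALES_ITEMS" WITH PARAMETERS ('FORCED_MERGE' = 'ON');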

 

CPU Usage:

To check the CPU usage in relation to the available CPU capacity, you can go to the Load Monitor from SAP HANA Studio.

STATUS / THRESHOLD

RED: Average CPU usage >= 90% of the available CPU capacity

YELLOW: Average CPU usage >= 75% and < 90% of the available CPU capacity

GREEN: Average CPU usage < 75% of the available CPU capacity

[Image: hostCPU.png]

 

The load graph and the Alerts tab can provide information about the time frame of the high CPU consumption. If you are not able to determine the time frame because the issue happened too long ago, check the following statistics server table, which includes historical host resource information for up to 30 days:

 

"_SYS_STATISTICS"."HOST_RESOURCE_UTILIZATION_STATISTICS"

 

With the time frame, you can search through the trace files of the responsible process, as they will provide indications of the threads or queries that were running at the time. If the high CPU usage is a recurrent issue caused by scheduled batch jobs or data loading processes, you may want to turn on the expensive statements trace to record all involved statements. For recurrently running background jobs like backups and delta merges, you may want to analyze the system views "SYS"."M_BACKUP_CATALOG" and "SYS"."M_DELTA_MERGE_STATISTICS" or "_SYS_STATISTICS"."HOST_DELTA_MERGE_STATISTICS".
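A hedged example of both steps, assuming the history view uses a timestamp SNAPSHOT_ID and the expensive statements trace is controlled via the [expensive_statement] section of global.ini (threshold in microseconds):

-- Historical host utilization for a suspicious time window
select *
from "_SYS_STATISTICS"."HOST_RESOURCE_UTILIZATION_STATISTICS"
where SNAPSHOT_ID between '2014-10-21 22:00:00' and '2014-10-22 02:00:00'
order by SNAPSHOT_ID;

-- Record statements running longer than 3 seconds
ALTER SYSTEM ALTER CONFIGURATION ('global.ini', 'SYSTEM')
  SET ('expensive_statement', 'enable') = 'true',
      ('expensive_statement', 'threshold_duration') = '3000000' WITH RECONFIGURE;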

 

For more information, please refer to the following SAP Note and also the SAP HANA Troubleshooting and Performance Analysis Guide.

 

SAP Note:

- 1909670 - How to handle HANA Alert 5: ‘Host CPU Usage'

 

 

Memory Consumption:

To check the memory consumption of tables compared to the available allocation limit, you can go to the Load Monitor from HANA Studio.

 

[Image: memoryUsage2.png]

 

STATUS / THRESHOLD

RED: Memory consumption of tables >= 70% of the available allocation limit.

YELLOW: Memory consumption of tables >= 50% of the available allocation limit.

GREEN: Memory consumption of tables < 50% of the available allocation limit.

 

As an in-memory database, it is critical for SAP HANA to handle and track its memory consumption carefully and efficiently; therefore, the HANA database pre-allocates and manages its own memory pool. The key concepts for in-memory HANA data are physical memory, allocated memory, and used memory.

- Physical Memory: The amount of physical (system) memory available on the host.

- Allocated Memory: The memory pool reserved by SAP HANA from the operating system

- Used Memory: The amount of memory that is actually used by HANA database.

 

Used Memory serves several purposes:

- Program code and stack

- Working space and data tables (heap and shared memory): the heap and shared area is used for working space, temporary data, and for storing all data tables (row and column store tables).
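A minimal sketch for checking the two sides of this threshold in SQL, assuming SYS.M_CS_TABLES and SYS.M_HOST_RESOURCE_UTILIZATION are available (row store tables would need to be added separately, e.g. from M_RS_TABLES):

-- Total memory consumption of all loaded column tables (GB)
select round(sum(MEMORY_SIZE_IN_TOTAL) / 1024 / 1024 / 1024, 2) as "COLUMN_TABLES_GB"
from SYS.M_CS_TABLES;

-- Allocation limit per host (GB)
select HOST, round(ALLOCATION_LIMIT / 1024 / 1024 / 1024, 2) as "ALLOCATION_LIMIT_GB"
from SYS.M_HOST_RESOURCE_UTILIZATION;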

 

For more information, please refer to the following SAP Note and also the SAP HANA Troubleshooting and Performance Analysis Guide.

 

Useful SAP Note:

- 1999997 - FAQ: SAP HANA Memory

 

HANA Column Unloads:

 

Check column unloads on the load graph under the Load tab in the SAP HANA Studio. This graph will give you an idea of the time frame of any high column unload activity.

STATUS / THRESHOLD

RED: >= 100,000 column unloads

YELLOW: >= 1,001 and < 100,000 column unloads

GREEN: <= 1,000 column unloads

 

Column store unloads indicate that the memory requirements exceed the currently available memory in the system. In a healthy situation, the executed code may request a reasonable amount of memory and require SAP HANA to free up rarely used memory resources. However, a high number of table unloads will have an impact on performance, as the tables need to be fetched again from disk.

There are a couple of things to look for.

 

- If the unloads happen on the statistics server, the memory allocated for the statistics server might not be sufficient; most of the time this is accompanied by out-of-memory errors. If this is the case, refer to SAP Note 1929538 - HANA Statistics Server - Out of Memory. On the other hand, if the unload reason is 'Unused resource', you should increase the parameter global.ini [memoryobjects] unused_retention_period.

 

- If the unloads happen on the indexserver and the reason for the unloads is low memory, it could be one of the following reasons:

1) The system is not properly sized

2) The table distribution is not optimized

3) Temporary memory shortage due to expensive SQL or mass activity

 

For more detailed information on this, please refer to SAP Note 1977207.


1977207 - How to handle HANA Alert 55: Columnstore unloads
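To see which tables were unloaded and why, a hedged query on the unload monitoring view might look like this (SYS.M_CS_UNLOADS only keeps a limited history of recent unloads):

-- Recent column unloads grouped by reason
select REASON, count(*) as "UNLOADS"
from SYS.M_CS_UNLOADS
group by REASON
order by count(*) desc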


License Information:

The view M_LICENSE shows the date on which the HANA license will expire. You can also check the HANA license information from HANA Studio: right-click the HANA system > Properties > License. If the license expires, the HANA system goes into a lockdown state; therefore, it is important to make sure the license is renewed before it expires.

 

select system_id, install_no, to_date(expiration_date), permanent, valid, product_name, product_limit, product_usage FROM "SYS"."M_LICENSE"

 

HANA database supports two kinds of license keys:

1) Temporary license key:

      - It is valid for 90 days.

      - It comes with a new SAP HANA database. During these 90 days, you should request and apply a permanent license key.

2) Permanent license key:

     - It is valid until the predefined expiration date.

     - Before a permanent license key expires, you should request and apply a new permanent license key.

 

For more information and the steps to request a license, please refer to SAP Note 1899480:

 

- 1899480 - How to handle HANA Alert 31: 'License expiry'
