OBIEE Agent sending emails to the wrong recipients

We recently ran into an issue where we had an OBI Agent set up to send personalized reports via email to each recipient, but some recipients (about 2%) were receiving the wrong email.

A search of Oracle Support produced Document ID # 2119485.1 as a possible solution.

“OBIEE 11g|12c: Agents Send Emails To Incorrect Recipients When Master Trigger Agent Is Present (Doc ID 2119485.1)”

This document recommended applying patch #s 22821787 and 25545058.

However, we are on OBIEE 12c (12.2.1.2.0), and only one of the patches applied to our version:

  • Patch # 25545058 seemed to be for 11g only.
  • Patch # 22821787 was for both 11g and 12c versions.

We applied patch # 22821787, but unfortunately, the issue persisted.

After looking around some more, we found another patch, this one for the 12.2.1.2.180116 release (in Document ID # 2395331.1). It didn’t match our version, but we decided to explore it anyway.

“OBIEE 12c : Agent Sending The Incorrect Result (Doc ID 2395331.1)”

That was patch # 27072632, but it turned out that patch had been superseded by patch # 27916905.

Our admin team tried to apply patch # 27916905, but it conflicted with the previously applied patch # 22821787.

We then backed out patch # 22821787 and applied the bundle patch 27916905.

Patch # 27916905 seems to have resolved the “email going to wrong recipients” issue. Since we applied it, no user has reported receiving the wrong email, although we are not yet 100% sure.

However, we have noticed that some images are not displaying properly, which may have been caused by the patch. We are looking into that issue now.

I went through the detailed description of how the patches were found to make a point: on the Oracle Support site, you may need to do a very thorough search to find any and all patches related to an issue before applying any of them. The documentation does not necessarily tie related patches together, and they will not necessarily come up when you search on the obvious keywords.

Note: Before any of the above changes were made, backups were taken so that we could revert to any stage that we wanted to.

Salesforce Einstein Bots

What is a Salesforce Einstein Bot?

According to Salesforce, a bot is “a computer program which conducts a conversation via auditory or textual methods.”

So, before we get more into what a bot is, let’s first look at the platform they are created on: Salesforce’s Einstein Analytics.

Salesforce’s Einstein Analytics provides impressive mechanisms that help organizations and their Salesforce users connect with, communicate with, and interpret the needs of their customers. By applying elements of artificial intelligence, data mining, and predictive analytics, Salesforce users can gain deeper insights into their customers’ data and begin to build an improved base of knowledge about their business. With an underlying engine tuned for performance and a presentation layer that can display key details or high-level metrics on dashboards, Einstein Analytics is the next step in reporting on the health of your sales pipeline, exposing opportunities, and providing suggestions to help you identify and visualize growth that aligns with your business.

Now, back to bots …

Basically, a bot is a means of facilitating communication between humans and computers, via either voice or text, and then executing an action tied to the input provided. By leveraging Salesforce’s Einstein Analytics platform and the data that resides in your Salesforce org, bots can learn over time to interact with humans and respond using Natural Language Processing. According to Wikipedia, “Natural language processing (NLP) is a subfield of linguistics, computer science, information engineering, and artificial intelligence concerned with the interactions between computers and human (natural) languages, in particular how to program computers to process and analyze large amounts of natural language data.”

Why are Salesforce Einstein Bots important?

Organizations are creating and implementing bots at an ever-increasing rate. With a bot, an organization can begin to get a handle on support cases, resolving many of them very quickly and, in some scenarios, eliminating the need to open a case at all.

Of course, a bot isn’t intended to supplant interaction with a human. However, bots can be leveraged to provide a decision path for customers and to route customers’ requests quickly based on their general needs, all while providing a positive initial reception that can augment your current customer service model.

Not only can bots improve agent productivity by freeing agents from having to spend time addressing some of the simpler, frequent requests, but they also allow agents to focus on more time-consuming, complex issues.

Bots can, in a sense, be considered another channel for content. However, instead of formulating bot questions from scratch, organizations should try to marry current content to bot questions. This reuse of in-house documentation and materials that are based on existing knowledge will ultimately bring development costs down and lead toward a more uniform, higher-quality interaction.

How do you configure the platform for Salesforce Einstein Bots?

Before you jump in and start creating bots, you would be best served by allocating time to plan your bot and consider how it will interact with your customers.

A good start is to collaborate with your agents and solicit their feedback on the customer issues they experience that a bot could potentially address.

Think about the bot’s persona: what its name should be, and how you would like it to convey and reiterate a consistent image of the company overall.

Decisions related to which channels to use, the ways in which customers can enter their questions, which licenses are required, which profile to use, whether to provide a menu, what is not in scope for the bot, and so on should all be worked out in advance of bot development.

At what point does a human need to take over from the bot’s interaction with the customer, if at all?

In Salesforce you will need a Service Cloud license and a Chat or Messaging license. Once those are obtained, you will need to turn on Lightning Experience. There is a guided setup flow for Chat that you will need to run through. If your organization has Knowledge articles that you want to make available to customers through the bot, Knowledge will need to be enabled as well. Then, in your Salesforce org, go to Setup, type Einstein Bots into the Quick Find box, and click the Einstein Bots result; under Settings there is a toggle to enable Einstein Bots.

When ready, make an Embedded Chat button available on your published Salesforce site or community site for your customers to interact with. A Salesforce community site is preferred.

Check out the free training at https://trailhead.salesforce.com/en/home to find out more about how to create bots within Salesforce.

Things to consider when maintaining Salesforce Einstein Bots

Salesforce documentation indicates that the following items should also be considered when planning bot creation:

  • Chat and Messaging licenses support different channels (such as SMS or Facebook Messenger) and might have different requirements.
  • Each org is provided 25 Einstein Bots conversations per month for each user with an active subscription.
  • To make full use of the Einstein Bots Performance page, obtain the Service Analytics App.

BI Application getting ORA-02391 error

Last week we rolled out a new dashboard that uses a new data source.
In one of our BI environments, the application was throwing an error:
“ORA-02391: exceeded simultaneous SESSIONS_PER_USER limit at OCI call OCISessionBegin”

This is an Oracle Database error, and not an error directly from the BI Application.

For the “ORA-02391: exceeded simultaneous SESSIONS_PER_USER limit” error …
Cause:   An attempt was made to exceed the maximum number of concurrent sessions allowed by the SESSIONS_PER_USER clause of the user profile.
Action:   End one or more concurrent sessions, or ask the database administrator to increase the SESSIONS_PER_USER limit of the user profile.

It turns out the SESSIONS_PER_USER limit was set too low: it was set to 3 for the user ID that the BI application uses to access the database. The same error could also be seen from an ETL tool accessing the database with an ID that is subject to the same profile setting.

One of the DBAs bumped this parameter up to 30 for the user, and that resolved the issue.
We requested that this change be made on the BI application databases in all the environments – Development, Test, QA, and Production.
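
For reference, here is a minimal sketch of how a DBA might check and raise this limit. The user and profile names below are hypothetical placeholders (note also that profile resource limits are only enforced when the RESOURCE_LIMIT initialization parameter is TRUE):

-- Find which profile governs the BI application user
SELECT username, profile FROM dba_users WHERE username = 'BI_APP_USER';

-- Check the current SESSIONS_PER_USER limit on that profile
SELECT profile, resource_name, limit
  FROM dba_profiles
 WHERE profile = 'BI_APP_PROFILE'
   AND resource_name = 'SESSIONS_PER_USER';

-- Raise the limit to 30 concurrent sessions
ALTER PROFILE BI_APP_PROFILE LIMIT SESSIONS_PER_USER 30;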

Although all seems to be well, we will now monitor how many sessions the application uses and whether there is any negative impact on the source application. This will allow us to determine if we need to make any other adjustments.
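
A quick way to keep an eye on this (again, with a placeholder user name) is to count the user’s concurrent sessions periodically:

-- Count concurrent database sessions for the BI application user
SELECT username, COUNT(*) AS session_count
  FROM v$session
 WHERE username = 'BI_APP_USER'
 GROUP BY username;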

Thanks for reading. I hope you found this information useful.

Quality Assurance (QA) for Data Projects or Data Applications

This post discusses Quality Assurance (QA) activities for data projects.

What is Quality Assurance (QA)?  Simply put, Quality Assurance, also called QA, Testing or Validation, is about testing an application or solution to ensure that all the stated/promised/expected requirements are met. It is a critically important activity for all software application development or implementations. Data applications are no different. They need to be tested to ensure they work as intended.

QA stands between development and deployment. And QA makes the difference between a delivered product and a high quality delivered product.

There are a number of things to keep in mind when you plan your Quality Assurance activities for data solutions. I present some of them in this post as suggestions, considerations, or prompting questions. The things mentioned here will not apply to all data applications but can be used as a guide or a check.

People / Teams

The number of people and teams involved in a project will vary depending on the size, scope and complexity of the project.

The technical team building the application needs to perform an initial level of validation of the solution.

If there is a Quality Assurance team that performs the validation tasks, then that team will need to perform the “official” validation.

The business analysts and end-users of the application also need to validate it. Where possible, work with as many end-users as can efficiently be involved. The more real users you have testing the application, the better the chances of finding issues early.

Where it makes sense, Test IDs that simulate various types of users or groups should be used to help test various usage and security scenarios. This is particularly useful in automated testing.

On large projects where there is a lot to be tested, it is best to break up the testing across multiple people or teams. This helps prevent tester fatigue and sloppiness, and results in higher quality testing.

Plan ahead to ensure that access for all the relevant users is set up in the testing environments.

Communication

With all the teams and people involved, it is important to have a plan for how they will communicate. Things to consider and have a plan for include:

  • How will teams communicate internally? Email, Microsoft Teams, SharePoint, and shared files are some options.
  • How will the various teams involved communicate with each other? In other words, how will cross-team communication be handled? As above, Email, Microsoft Teams, SharePoint, Shared Files, are some options.
  • How will issues and status be communicated? Weekly meetings, Status emails or documents, Shared files available on shared spaces are options.
  • How will changes and resolutions be tracked? Files, SDLC applications, Change Management applications are options.
  • How will teams and individuals be notified when they need to perform a task? Manual communication or automated notifications from tools are options.

Data

The most important thing to ensure in data projects is that the data is high quality, particularly the “base” data set. If the base data is incorrect, everything built on top of it will be incorrect as well. The correctness of intermediate and user-facing data is, of course, just as important, but validating the base data is critical to achieving correct data everywhere else.

  • Ensure that table counts, field counts and row counts of key data are correct.
  • Does the data warehouse data match the source data? (See the sketch after this list.)
  • Test detailed, low-level records with small samples of data.
  • Test to ensure that the data and the values conform to what is expected. For example, ensure that there is no data older than 3 years, or that there are no account values outside a certain range. The Data Governance Team may become involved in these activities across all projects.
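
Count reconciliation can be as simple as the following sketch, where the source and warehouse table names are hypothetical:

-- Compare row counts between a source table and its warehouse counterpart
SELECT 'source' AS side, COUNT(*) AS row_count FROM src_schema.orders
UNION ALL
SELECT 'warehouse' AS side, COUNT(*) AS row_count FROM dw_schema.orders_fact;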

Next in line is the “intermediate” data such as derived metrics, aggregates, specialized subsets, and more. These will also need to be verified.

  • Are the calculated values correct?
  • Are the aggregates correct? Test aggregate data with small, medium and large sets of data. (See the sketch after this list.)
  • Verify metric calculations.
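
One way to check an aggregate, again with hypothetical table and column names, is to recompute it from the detail rows it summarizes and compare:

-- A fresh rollup of the detail rows ...
SELECT SUM(amount) AS detail_total
  FROM dw_schema.orders_fact
 WHERE order_month = '2020-01';

-- ... should match the stored aggregate
SELECT total_amount
  FROM dw_schema.orders_agg
 WHERE order_month = '2020-01';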

Then the user-facing data or data prepared for self-service usage needs to be validated.

  • Does the data on the dashboard match the data in the database?
  • Are the KPIs correctly reflecting the status?

Test the full flow of the data. The validity of the data should be verified at each stage of the data flow – from the source, to the staging, to the final tables in the data warehouse, to aggregates or subsets, to the dashboard.

Take snapshots of key datasets or reports so you can compare results post data migration.

Some additional data prep might be needed in some cases.

  • These include making sure that you have sourced adequate data for testing. For example, if you need to test an annual trend, then it might be best to have at least a year’s worth of data, preferably two.
  • You may need to scramble or redact some data for testing. Often, test data is taken from the Production environment and then scrambled and/or redacted so as not to expose sensitive information. (See the sketch after this list.)
  • You may need to temporarily load in data for testing. For various reasons, you may need to load some Production data into the QA environment just to test the solution or a particular feature and then remove the data after the testing is complete. While this can be time consuming, sometimes it’s necessary, and it’s good to be aware of the need early and make plans accordingly.
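
As a minimal sketch of scrambling, assuming a hypothetical customers table with an email column in the test schema, synthetic values can be substituted for the real ones:

-- Replace real email addresses with synthetic ones in the test copy
UPDATE qa_schema.customers
   SET email = 'user' || customer_id || '@example.com';
COMMIT;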

Aesthetics & Representation of Data

Presentation matters. Although the most critical thing is data correctness, how the data is presented is also very important. Good presentation helps with understanding, usability, and adoption. A few things to consider include:

  • Does the application, such as a dashboard, look good? Does it look right?
  • Are the components laid out properly so that there is no overcrowding?
  • Are the logos, colors and fonts in line with company expectations?
  • Are proper chart options used to display the various types of data and metrics?
  • Is the information provided in a way that users can digest?

Usage

The data application or solution should be user friendly – preferably intuitive, or at least accompanied by good documentation. The data must be useful to the intended audience, in that it should help them understand the information and make good decisions or take sensible actions based on it.

The application should present data in a manner that is effective – easy to access, and easy to understand.

The presentation should satisfy the analytic workflows of the various users. Users should be able to logically step through the application to find information at the appropriate level of detail that they need based on their role.

A few things that affect usability include:

  • Prompts – ensure that all the proper prompts or selections are available to users to slice and filter the data as necessary. And of course, verify that they work.
  • Drill downs and drill throughs – validate that users can drill-down and across data to find the information they need in a simple, logical manner.
  • Easy interrogation of the data – if the application is ad-hoc in nature, validate that users can navigate it or at least verify that the documentation is comprehensive enough for users to follow.

Security

Securing the application and its data so that only authorized users have access to it is critical.

Application security comprises “authentication” – whether a user can access the application – and “authorization” – what the user is allowed to do once in the application.

Authorization (what a user is authorized to do within the application) can be broken into “object security” – what objects or features a user has access to, and “data security” – what data elements a user has access to within the various objects or features.

For example, a user has access to an application (authenticated / can log in), and within the application the user has access to (authorized to see and use) 3 of 10 reports (object-level security). The user is not authorized to see the other 7 reports (object-level security) and, therefore, will not have access to them. Now, within the 3 reports that the user has access to, he or she can only see data related to 1 of 5 departments (data-level security).
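
In SQL terms, data-level security often boils down to a predicate applied on the user’s behalf. A minimal sketch, with all table and column names hypothetical:

-- Only return rows for the departments this user is entitled to see
SELECT r.*
  FROM report_data r
 WHERE r.department_id IN (
         SELECT a.department_id
           FROM user_department_access a
          WHERE a.username = 'JSMITH');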

All object-level and data-level security needs to be validated. This includes negative testing: not only test to make sure that users have the access they need, but also test to ensure that users do not have access that they should not have.

  • Data for testing should be scrambled or redacted as appropriate to protect it.
  • Some extremely sensitive data may need to be filtered out entirely.
  • Can all the appropriate users access the application?
  • Are non-authorized users blocked from accessing the application?
  • Can users see the data they should be able to see to perform their jobs?

Performance

Performance of the data solution is important to user efficiency and user adoption. If users cannot get the results they need in a timely manner, they will look elsewhere to get what they need. Even if they have no choice, a poorly performing application will result in wasted time and dollars.

A few things to consider for ensuring quality around performance:

  • Application usage – is the performance acceptable? Do the results get returned in an acceptable time?
  • Data Integration – is the load performance acceptable?
  • Data processing – can the application perform all the processing it needs to do in a reasonable amount of time?
  • Stress Testing – how is performance with many users? How is it with a lot of data?
  • How is performance with various selections or with no selections at all?
  • Is ad-hoc usage set up to be flexible while avoiding rogue analyses that may cripple the system?
  • Is real-time analysis needed and is the application quick enough?

These items need to be validated and any issues need to be reported to the appropriate teams for performance tuning before the application is released for general usage.

Methodology

Each organization, and even each team within an organization, will have a preferred methodology for application development and change management, including how they perform QA activities.

Some things to consider include:

  • Get QA resources involved in projects early so that they gain an early understanding of the requirements and the solutions to assess and plan how best to test.
  • When appropriate, do not wait until all testing is complete before notifying development teams of issues discovered. Notifying them early could make the difference between your project being on-time or late.
  • Create a test plan and test scripts – even if they are high-level.
  • Where possible, execute tasks in an agile, iterative manner.
  • Each environment will have unique rules and guidelines that need to be validated. For example, your application may have a special naming convention, color & font guidelines, special metadata items, and more. You need to validate that these rules and guidelines are followed.
  • Use a checklist to ensure that you validate with consistency from deliverable to deliverable.
  • When the solution being developed is replacing an existing system or dataset, use the new and old solutions in parallel to validate the new against the old.
  • Document test results. All testing participants should document what has been tested and the results. This may be as simple as a checkmark or a “Done” status, but may also include things like data entered, screenshots, results, errors, and more.
  • Update the appropriate tracking tools (such as your SDLC or Change Management tools) to document changes and validation. These tools will vary from company to company, but it is best to have a trail of the development, testing, and release to production.
  • For each company and application, there will be a specific, unique set of things that need to be done. It is best if you have a standard test plan or test checklist to help you confirm that you have tested all important aspects and scenarios of the application.

This is not an all-encompassing coverage of Quality Assurance for data solutions, but I hope the article gives you enough information to get started or tips for improving what you currently have in place. You can share your questions, thoughts and input via comments to this post. Thanks for reading!

BI Application getting ORA-00257 Error

One day this week, we got the following error showing up on our BI dashboards.
“ORA-00257: Archiver error. Connect AS SYSDBA only until resolved.”
This is an Oracle database error (which you may guess based on the “ORA”), and not an error directly from the BI application.

If you get this error, it means that the database redo logs are filled up and cannot be archived, due to a lack of space in the designated archive area or some other issue. In our case, the “some other issue” was a problem with Commvault, a software application used for data backup and recovery, among other things.

When this happens and a user tries to connect to the database – such as the BI application user, in our case – the database will not allow the new connection. The only exception is that SYSDBA users will still be allowed to connect.

If you are not the database administrator (DBA), you will most likely work with your DBA (as we do) to get this error resolved.
After the issue that caused the problem is resolved and the redo logs are cleared, the database – and therefore the BI application – will allow new connections as normal.
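
For reference, a DBA would typically start diagnosing with queries along these lines (run as SYSDBA, and assuming the archiver writes to the Fast Recovery Area):

-- How full is the recovery area, and what is using the space?
SELECT file_type, percent_space_used, percent_space_reclaimable
  FROM v$recovery_area_usage;

-- Where is the archiver writing, and what is the destination status?
SELECT dest_name, destination, status
  FROM v$archive_dest
 WHERE status = 'VALID';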

Thanks for reading and I hope you found this helpful.

Learning Hadoop: The benefits of Hadoop commercial distributions

What are the benefits of using a commercial distribution of Hadoop? And what are the popular commercial distributions of Hadoop?

Hadoop, the preeminent open-source platform for retrieving, processing, storing and analyzing very large amounts of data, has grown tremendously from its core components pioneered by Google into a powerful ecosystem of supporting tools. There are various tools for integrating, streaming, storing, searching, and retrieving data, and tools for security and resource management, among others. And new tools keep emerging at a rapid pace.

Keeping these tools in sync with versions that are compatible with each other, keeping patches up to date, plugging in new tools as they become available, and making sure it all works well together – on top of the normal management of the Hadoop cluster – can become overwhelming for a small team. Using a commercial distribution of Hadoop alleviates this problem.

Commercial distributions of Hadoop bundle the various tools of the ecosystem using compatible versions, ensure that they all work together, apply patches, package the software so that it is easy to download and install, and provide tools for managing the platform. For production projects created to help meet important business goals, it’s best to use a commercial distribution instead of trying to handle it all on your own. This will allow your team more time to focus on building business solutions instead of solving pesky technology issues.

Some of the most popular commercial distributions of Hadoop (not in any specific order) are:

  • Cloudera Hadoop Distribution (CDH)
    • Some major technology vendors, such as Oracle and Dell, provide their flavors of CDH
  • Hortonworks Data Platform (HDP)
  • Amazon Elastic MapReduce
  • MapR Hadoop Distribution
  • IBM Open Platform
  • Microsoft Azure’s HDInsight
  • Pivotal Big Data Suite
  • Datameer Professional
  • Datastax Enterprise Analytics

I will provide details of the various distributions in future posts.

Learning Hadoop: The key features and benefits of Hadoop

What are the key features and benefits of Hadoop? Why is Hadoop such a successful platform?

Apache Hadoop, mostly called just Hadoop, is a software framework and platform for reading, processing, storing and analyzing very large amounts of data. There are several features of Hadoop that make it a very powerful solution for data analytics.

Hadoop is Distributed

With Hadoop, anywhere from a few to hundreds or thousands of commodity servers (called nodes) can be connected (forming a cluster) to work together to achieve whatever processing power and storage capability is needed. The software platform enables the nodes to work together, passing work and data between them. Data and processing are distributed across nodes, which spreads the load and significantly reduces the impact of failure.

Hadoop is Scalable

In the past, to achieve extremely powerful computing, a company would have to buy very expensive, large, monolithic computers. As data growth exploded, eventually even those supercomputers would become insufficient. With Hadoop, anywhere from a few to many thousands of commodity servers can be connected relatively easily to work together to achieve whatever processing power and storage capability is needed. This allows a company or project to start out small and then grow as needed, inexpensively, and without much concern about hitting a limitation.

Hadoop is Fault Tolerant

Hadoop was designed and built around the fact that there will be frequent failures among the commodity hardware servers that make up the Hadoop cluster. When a failure occurs, the software automatically reassigns work and replicates data to other nodes in the cluster, and the system continues to function properly without manual intervention. When a node recovers (after a reboot, for example), it rejoins the cluster automatically and becomes available for work.

Hadoop is backed by the power of Open Source

Hadoop is open source software, which means that it can be downloaded, installed, used and even modified for free. It is managed by the renowned non-profit group, the Apache Software Foundation (ASF) – hence the name Apache Hadoop. The group is made up of many brilliant people from all over the world, many of whom work at top technology companies, who commit their time to managing the software. In addition, many developers contribute code to enhance or add new features and functionality to Hadoop, or to add new tools that work with Hadoop. The various tools that have been built over the years to complement core Hadoop make up what is called the Hadoop ecosystem. With a large community of people from all over the world continuously adding to the growth of the Hadoop ecosystem in a well-managed way, it will only get better and become useful for many more use cases.

These are the reasons Hadoop has become such a force within the data world. Although there is some hype around the big data phenomenon, the benefits and solutions based on the Hadoop ecosystem are real.

You can learn more at https://hadoop.apache.org

Connecting to Microsoft SQL Server database from Oracle SQL Developer

If you work primarily with Oracle databases, you may use SQL Developer. But you may also need to connect to Microsoft SQL Server databases without wanting to install a new front-end database tool, such as Microsoft SQL Server Management Studio (SSMS). You can connect to SQL Server from SQL Developer.

First, download the appropriate JDBC Driver for the version of SQL Server that you need to connect to. Then follow the steps in the video at the link below on the Oracle website.

https://www.oracle.com/technetwork/developer-tools/sql-developer/sql-server-connection-viewlet-swf-089886.html

Good luck.


Error downloading data to Excel in OBIEE 12c

If you get the error …

“There was an error processing your download. Please check with your administrator.”

… when Exporting / Downloading data from an analysis in OBIEE, then this post might be helpful.

In OBIEE, you have a few options for exporting / downloading data from an analysis / report, as shown below. You can export / download to PDF, Excel 2007+, PowerPoint 2007+, or Web Archive (.mht), or to the CSV, Tab Delimited, or XML data formats.

[Image: OBIEE_Export_Analysis_Data – the export / download options for an analysis]

If you are trying to download data from an analysis and get this error …

[Image: OBIEE_Export_to_Excel_Failed – the export error dialog]

“There was an error processing your download. Please check with your administrator.”
then try the following fix. Note: you might find that you are able to download to CSV but get the error when you try to download to Excel.

Edit the config.xml file
For OBIEE 12c, it is located in the following directory:
ORACLE_HOME/user_projects/domains/bi/config/fmwconfig/biconfig/OBIJH

FYI, this is the location of the config.xml file in OBIEE 11g:
[MW_HOME]/instances/instance1/config/OracleBIJavaHostComponent/coreapplication_obijh1

Locate the XMLP section and the InputStreamLimitInKB parameter within that section. The section may look like the snippet below, but there could be more or fewer parameters within it than shown here.

<XMLP>
<InputStreamLimitInKB>8192</InputStreamLimitInKB>
<ReadRequestBeforeProcessing>true</ReadRequestBeforeProcessing>
</XMLP>

Change the value of the InputStreamLimitInKB parameter to 0 as shown below…

<XMLP>
<InputStreamLimitInKB>0</InputStreamLimitInKB>
<ReadRequestBeforeProcessing>true</ReadRequestBeforeProcessing>
</XMLP>

Restart the OBIEE services and try your export again.

If the export succeeds, then you know that altering this parameter fixes the problem.
Setting the above parameter value to zero (0) means that there is no limit. So, you may now go back and modify the value to a number that is suitable for your environment, such as 8192, 15384, 65536, etc.