Category: Data Security

Deploy and Manage Semantic Models Using the XMLA Endpoint

This post is a part of the DP-600: Implementing Analytics Solutions Using Microsoft Fabric Exam Prep Hub; and this topic falls under these sections: 
Maintain a data analytics solution
--> Implement security and governance
--> Deploy and manage semantic models by using the XMLA endpoint

The XMLA endpoint enables advanced, enterprise-grade management of Power BI semantic models in Microsoft Fabric. It allows analytics engineers to deploy, modify, automate, and govern semantic models using external tools and scripts—bringing full ALM (Application Lifecycle Management) capabilities to analytics solutions.

For the DP-600 exam, you should understand what the XMLA endpoint is, when to use it, what it enables, and how it fits into the analytics development lifecycle.

What Is the XMLA Endpoint?

The XMLA (XML for Analysis) endpoint is a programmatic interface that exposes semantic models in Fabric as Analysis Services-compatible models.

Through the XMLA endpoint, you can:

  • Deploy semantic models
  • Modify model metadata
  • Manage partitions and refreshes
  • Automate changes across environments
  • Integrate with DevOps workflows

Exam note:
The XMLA endpoint is enabled by default in Fabric workspaces backed by appropriate capacity.

When to Use the XMLA Endpoint

The XMLA endpoint is used when you need:

  • Advanced model editing beyond Power BI Desktop
  • Automated deployments
  • Bulk changes across models
  • Integration with CI/CD pipelines
  • Scripted refresh and partition management

It is commonly used in enterprise and large-scale deployments.

Tools That Use the XMLA Endpoint

Several tools connect to Fabric semantic models through XMLA:

  • Tabular Editor
  • SQL Server Management Studio (SSMS)
  • PowerShell scripts
  • Azure DevOps pipelines
  • Custom automation tools

These tools operate directly on the semantic model metadata.

Common XMLA-Based Management Tasks

Deploying Semantic Models

  • Push model definitions from source control
  • Promote models across Dev, Test, and Prod
  • Align models with environment-specific settings

Managing Model Metadata

  • Create or modify:
    • Measures
    • Calculated columns
    • Relationships
    • Perspectives
  • Apply bulk changes efficiently

Managing Refresh and Partitions

  • Configure incremental refresh
  • Trigger or monitor refresh operations
  • Manage large models efficiently

XMLA Endpoint and the Development Lifecycle

XMLA plays a key role in:

  • CI/CD pipelines for analytics
  • Automated model validation
  • Environment promotion
  • Controlled production updates

It complements:

  • PBIP projects
  • Git integration
  • Development pipelines

Permissions and Requirements

To use the XMLA endpoint:

  • The workspace must be on supported capacity
  • The user must have sufficient permissions:
    • Workspace Admin or Member
  • Access is governed by Fabric and Entra ID

Exam insight:
Viewers cannot use XMLA to modify models.

XMLA Endpoint vs Power BI Desktop

FeaturePower BI DesktopXMLA Endpoint
Visual modelingYesNo
Scripted changesNoYes
AutomationLimitedStrong
Bulk editsNoYes
CI/CD integrationLimitedYes

Key takeaway:
Power BI Desktop is for design; XMLA is for enterprise management and automation.

Common Exam Scenarios

Expect questions such as:

  • Automating semantic model deployment → XMLA
  • Making bulk changes to measures → XMLA
  • Managing partitions for large models → XMLA
  • Integrating Power BI models into DevOps → XMLA
  • Editing a production model without Desktop → XMLA

Example:

A company needs to automate semantic model deployments across environments.
Correct concept: Use the XMLA endpoint.

Best Practices to Remember

  • Use XMLA for production changes and automation
  • Combine XMLA with:
    • Git repositories
    • Tabular Editor
    • Deployment pipelines
  • Limit XMLA access to trusted roles
  • Avoid manual production edits when automation is available

Key Exam Takeaways

  • XMLA enables advanced semantic model management
  • Supports automation, scripting, and CI/CD
  • Used with tools like Tabular Editor and SSMS
  • Requires appropriate permissions and capacity
  • A core ALM feature for DP-600

Exam Tips

  • If a question mentions automation, scripting, bulk model changes, or CI/CD, the answer is almost always the XMLA endpoint.
  • If it mentions visual report design, the answer is Power BI Desktop.
  • Expect questions that test:
    • When to use XMLA vs Power BI Desktop
    • Tool selection (Tabular Editor vs pipelines)
    • Security and permissions
    • Enterprise deployment scenarios
  • High-value keywords to remember:
    • XMLA • TMSL • External tools • CI/CD • Metadata management

Practice Questions

Question 1 (Single choice)

What is the PRIMARY purpose of the XMLA endpoint in Microsoft Fabric?

A. Enable SQL querying of lakehouses
B. Provide programmatic management of semantic models
C. Secure data using row-level security
D. Schedule data refreshes

Correct Answer: B

Explanation:
The XMLA endpoint enables advanced management and deployment of semantic models using tools such as:

  • Tabular Editor
  • SQL Server Management Studio (SSMS)
  • Power BI REST APIs

Question 2 (Multi-select)

Which tools can connect to a Fabric semantic model via the XMLA endpoint? (Select all that apply.)

A. Tabular Editor
B. SQL Server Management Studio (SSMS)
C. Power BI Desktop
D. Azure Data Studio

Correct Answers: A, B

Explanation:

  • Tabular Editor and SSMS use XMLA to manage models.
  • ❌ Power BI Desktop uses a local model, not XMLA.
  • ❌ Azure Data Studio does not manage semantic models via XMLA.

Question 3 (Scenario-based)

You want to deploy a semantic model from Development to Production while preserving model metadata. What is the BEST approach?

A. Export and re-import a PBIX file
B. Use deployment pipelines only
C. Use XMLA with model scripting
D. Rebuild the model manually

Correct Answer: C

Explanation:
XMLA enables:

  • Model scripting (TMSL)
  • Metadata-preserving deployments
  • Controlled promotion across environments

Question 4 (Single choice)

Which capability requires the XMLA endpoint to be enabled?

A. Creating reports
B. Editing DAX measures outside Power BI Desktop
C. Viewing model lineage
D. Applying sensitivity labels

Correct Answer: B

Explanation:
Editing measures, calculation groups, and partitions using external tools requires XMLA connectivity.


Question 5 (Scenario-based)

An enterprise team wants to automate semantic model deployment through CI/CD pipelines. Which XMLA-based artifact is MOST commonly used?

A. PBIP project file
B. TMSL scripts
C. DAX Studio queries
D. SQL views

Correct Answer: B

Explanation:
Tabular Model Scripting Language (TMSL) is the standard XMLA-based format for:

  • Creating
  • Updating
  • Deploying semantic models programmatically

Question 6 (Multi-select)

Which operations can be performed through the XMLA endpoint? (Select all that apply.)

A. Create and modify measures
B. Configure partitions and refresh policies
C. Apply row-level security
D. Build report visuals

Correct Answers: A, B, C

Explanation:
XMLA supports model-level operations. Report visuals are created in Power BI reports, not via XMLA.


Question 7 (Scenario-based)

You attempt to connect to a semantic model via XMLA but the connection fails. What is the MOST likely cause?

A. XMLA endpoint is disabled for the workspace
B. Dataset refresh is in progress
C. Data source credentials are missing
D. The report is unpublished

Correct Answer: A

Explanation:
XMLA must be:

  • Enabled at the capacity or workspace level
  • Supported by the Fabric SKU

Question 8 (Single choice)

Which security requirement applies when using the XMLA endpoint?

A. Viewer permissions are sufficient
B. Read permission only
C. Contributor or higher workspace role
D. Report Builder permissions

Correct Answer: C

Explanation:
Managing semantic models via XMLA requires Contributor, Member, or Admin roles.


Question 9 (Scenario-based)

A developer edits calculation groups using Tabular Editor via XMLA. What happens after saving changes?

A. Changes remain local only
B. Changes are immediately published to the semantic model
C. Changes require a dataset refresh to apply
D. Changes are stored in the PBIX file

Correct Answer: B

Explanation:
Edits made via XMLA tools apply directly to the deployed semantic model in Fabric.


Question 10 (Multi-select)

Which are BEST practices when managing semantic models using XMLA? (Select all that apply.)

A. Use source control for TMSL scripts
B. Limit XMLA access to production workspaces
C. Make direct changes in production without testing
D. Combine XMLA with deployment pipelines

Correct Answers: A, B, D

Explanation:
Best practices include:

  • Version control
  • Controlled access
  • Structured deployments

❌ Direct production changes without testing increase risk.


Power BI Workspace roles

Power BI has 4 roles. Those roles, in order of increasing access/capabilities, are Viewer, Contributor, Member, and Admin. Before granting roles to users in your environment, it’s best to have a solid understanding of what each role has access to and is capable of doing.

The table below provides a list of capabilities of each role. As you will see, each roles “absorbs” or “inherits” the capabilities of all the roles below it in the hierarchy – for example, the Contributor can do everything the Viewer can do plus more, and the Member can do everything the Contributor can do plus more.


The Power BI Workspace roles

ViewerContributorMemberAdmin
View dashboards, reports, and workbooks in the workspaceEverything that the Viewer can doEverything that the Contributor can doEverything that the Member can do
Read data from dataflows in the workspaceAdd, edit, delete content workspacesAdd other users as members, contributors, or viewers to the workspaceUpdate and delete the workspace
Row-level security applies to viewersSchedule refreshes and use the on-premises gateway within workspaces Publish and update the workspace appAdd and remove other users of any role from the workspace
Feature dashboards and reports from workspacesShare and allow others to reshape items from the workspace
Have access to the lineage viewFeature the workspace app
Have full access to all datasets within a workspace

A few things to keep in mind regarding roles:

  • Only the Member and Admin roles can perform access related tasks and publish apps.
  • Both Member and Admin roles can update workspaces, but only the Admin role can delete.
  • By default, the Contributor role cannot update apps, but there is a workspace setting that allows Contributors to update apps.
  • Both the Member and Admin roles can add users, but only the Admin role can add other Admins. 
  • A Power BI Pro license is needed to be able to fully utilize the Admin role. 

This article was intended to be an easy read; more detailed information regarding Power BI roles can be found here on the Microsoft site.

Thanks for reading!

Quality Assurance (QA) for Data Projects or Data Applications

This post discusses Quality Assurance (QA) activities for data projects.

What is Quality Assurance (QA)?  Simply put, Quality Assurance, also called QA, Testing or Validation, is about testing an application or solution to ensure that all the stated/promised/expected requirements are met. It is a critically important activity for all software application development or implementations. Data applications are no different. They need to be tested to ensure they work as intended.

QA stands between development and deployment. And QA makes the difference between a delivered product and a high quality delivered product.

There are a number of things to keep in mind when you plan your Quality Assurance activities for data solutions. I present some of them in this post as suggestions, considerations, or prompting questions. The things mentioned here will not apply to all data applications but can be used as a guide or a check.

People / Teams

The number of people and teams involved in a project will vary depending on the size, scope and complexity of the project.

The technical team building the application needs to perform an initial level of validation of the solution.

If there is a Quality Assurance team that performs the validation tasks, then that team will need to perform the “official” validation.

The business analysts and end-users of the application also need to validate. Where possible, work with as many end users as efficiently possible. The more real users you have testing the application, the better the chances of finding issues early.

Where it makes sense, Test IDs that simulate various types of users or groups should be used to help test various usage and security scenarios. This is particularly useful in automated testing.

On large projects where there is a lot to be tested, it is best to break up the testing across multiple people or teams. This will help to prevent testing fatigue and sloppy testing and result in higher quality testing.

Plan ahead to ensure that access for all the relevant users is set up in the testing environments.

Communication

With all the teams and people involved, it is important to have a plan for how they will communicate. Things to consider and have a plan for include:

  • How will teams communicate within? Email, Microsoft Teams, SharePoint, Shared Files, are some options.
  • How will the various teams involved communicate with each other? In other words, how will cross-team communication be handled? As above, Email, Microsoft Teams, SharePoint, Shared Files, are some options.
  • How will issues and status be communicated? Weekly meetings, Status emails or documents, Shared files available on shared spaces are options.
  • How will changes and resolutions be tracked? Files, SDLC applications, Change Management applications are options.
  • How will teams and individuals be notified when they need to perform a task? Manual communication or automated notifications from tools are options.

Data

The most important thing to ensure in data projects is that the data is high quality, particularly the “base” data set. If the base data is incorrect, everything built on top of it will be bad. Of course, the correctness of intermediate and user-facing data is also just as important, but the validation of the base data is critical to achieving the correct data all over.

  • Ensure that table counts, field counts and row counts of key data are correct.
  • Does the data warehouse data match the source data?
  • Test detailed, low level records with small samples of data
  • Test to ensure that the data and the values conform to what is expected. For example, ensuring that there is no data older than 3 years old, or ensuring that there are no account values outside a certain range. The Data Governance Team may become involved in these activities across all projects.

Next in line is the “intermediate” data such as derived metrics, aggregates, specialized subsets, and more. These will also need to be verified.

  • Are the calculated values correct?
  • Are the aggregates correct? Test aggregate data with small, medium and large sets of data
  • Verify metric calculations

Then the user-facing data or data prepared for self-service usage needs to be validated.

  • Does the data on the dashboard match the data in the database?
  • Are the KPIs correctly reflecting the status?

Test the full flow of the data. The validity of the data should be verified at each stage of the data flow – from the source, to the staging, to the final tables in the data warehouse, to aggregates or subsets, to the dashboard.

Take snapshots of key datasets or reports so you can compare results post data migration.

Some additional data prep might be needed in some cases.

  • These include making sure that you have sourced adequate data for testing. For example, if you need to test an annual trend, then it might be best to have at least a year’s worth of data, preferably two.
  • You may need to scramble or redact some data for testing. Often Test data is taken from the Production environment and then scrambled and/or redacted in order to not expose sensitive information.
  • You may need to temporarily load in data for testing. For various reasons, you may need to load some Production data into the QA environment just to test the solution or a particular feature and then remove the data after the testing is complete. While this can be time consuming, sometimes it’s necessary, and it’s good to be aware of the need early and make plans accordingly.

Aesthetics & Representation of Data

Presentation matters. Although the most critical thing is data correctness, how the data is presented is also very important. Good presentation helps with understanding, usability, and adoption. A few things to consider include:

  • Does the application, such as dashboard, look good?  Does it look right? 
  • Are the components laid out properly so that there is no overcrowding?
  • Are the logos, colors and fonts in line with company expectations?
  • Are proper chart options used to display the various types of data and metrics?
  • Is the information provided in a way that users can digest?

Usage

The data application or solution should be user friendly, preferably intuitive or at least have good documentation. The data must be useful to the intended audience, in that, it should help them to understand the information and make good decisions or take sensible actions based on it.

The application should present data in a manner that is effective – easy to access, and easy to understand.

The presentation should satisfy the analytic workflows of the various users. Users should be able to logically step through the application to find information at the appropriate level of detail that they need based on their role.

A few things that affect usability include:

  • Prompts – ensure that all the proper prompts or selections are available to users to slice and filter the data as necessary. And of course, verify that they work.
  • Drill downs and drill throughs – validate that users can drill-down and across data to find the information they need in a simple, logical manner.
  • Easy interrogation of the data – if the application is ad-hoc in nature, validate that users can navigate it or at least verify that the documentation is comprehensive enough for users to follow.

Security

Securing the application and its data so that only authorized users have access to it is critical.

Application security comprises of “authentication”– access to the application, and “authorization” – what a user is authorized to do when he or she accesses the application.

Authorization (what a user is authorized to do within the application) can be broken into “object security” – what objects or features a user has access to, and “data security” – what data elements a user has access to within the various objects or features.

For example, a user has access to an application (authenticated / can log in), and within the application the user has access to (authorized to see and use) 3 of 10 reports (object-level security). The user is not authorized to see the other 7 reports (object-level security) and, therefore, will not have access to them. Now, within the 3 reports that the user has access to, he or she can only see data related to 1 of 5 departments (data-level security).

All object-level and data-level security needs to be validated. This includes negative testing. Not only test to make sure that users have the access they need, but testing should also ensure that users do not have access that they should not have.

  • Data for testing should be scrambled or redacted as appropriate to protect it.
  • Some extremely sensitive data may need to be filtered out entirely.
  • Can all the appropriate users access the application?
  • Are non-authorized users blocked from accessing the application?
  • Can user see the data they should be able to see to perform their jobs?

Performance

Performance of the data solution is important to user efficiency and user adoption. If users cannot get the results they need in a timely manner, they will look elsewhere to get what they need. Even if they have no choice, a poorly performing application will result in wasted time and dollars.

A few things to consider for ensuring quality around performance:

  • Application usage – is the performance acceptable? Do the results get returned in an acceptable time?
  • Data Integration – is the load performance acceptable?
  • Data processing – can the application perform all the processing it needs to do in a reasonable amount of time?
  • Stress Testing – how is performance with many users? How is it with a lot data?
  • How is performance with various selections or with no selections at all?
  • Is ad-hoc usage setup to be flexible but avoid rogue analyses that may cripple the system?
  • Is real-time analysis needed and is the application quick enough?

These items need to be validated and any issues need to be reported to the appropriate teams for performance tuning before the application is released for general usage.

Methodology

Each organization, and even each team within an organization, will have a preferred methodology for application development and change management, including how they perform QA activities.

Some things to consider include:

  • Get QA resources involved in projects early so that they gain an early understanding of the requirements and the solutions to assess and plan how best to test.
  • When appropriate, do not wait until all testing is complete before notifying development teams of issue discovered. By notifying them early, this could make the difference between your project being on-time or late.
  • Create a test plan and test scripts – even if they are high-level.
  • Where possible, execute tasks in an agile, iterative manner.
  • Each environment will have unique rules and guidelines that need to be validated. For example, your application may have a special naming convention, color & font guidelines, special metadata items, and more. You need to validate that these rules and guidelines are followed.
  • Use a checklist to ensure that you validate with consistency from deliverable to deliverable
  • When the solution being developed is replacing an existing system or dataset, use the new and old solutions in parallel to validate the new against the old.
  • Document test results. All testing participants should document what has been tested and the results. This may be as simple as a checkmark or a “Done” status, but may also include things like data entered, screenshots, results, errors, and more.
  • Update the appropriate tracking tools (such as your SDLC or Change Management tools) to document changes and validation. These tools will vary from company to company, but it is best to have a trail of the development, testing, and release to production.
  • For each company and application, there will a specific, unique set of things that will need to be done. It is best if you have a standard test plan or test checklist to help you confirm that you have tested all important aspects and scenarios of the application.

This is not an all-encompassing coverage of Quality Assurance for data solutions, but I hope the article gives you enough information to get started or tips for improving what you currently have in place. You can share your questions, thoughts and input via comments to this post. Thanks for reading!

Creating a Business Intelligence (BI) & Analytics Strategy and Roadmap

This post provides some of my thoughts on how to go about creating a Business Intelligence (BI) & Analytics Strategy and Roadmap for your client or company.  Please comment with your suggestions from your experience for improving this information.

 

When creating or updating the BI & Analytics Strategy and Roadmap for a company, one of the first things to understand is:

Who are all the critical stakeholders that need to be involved?

Understanding who needs and uses the BI & Analytics systems is critical for starting the process of understanding and documenting the “who needs what, why, and when”.

These are some of the roles that are typically important stakeholders:

  • High-level business executives that are paying for the projects
  • Business directors involved in the usage of the systems
  • IT directors involved in the developing and support of the systems
  • Business Subject Matter Experts (SME’s) & Business Analysts
  • BI/Analytics/Data/System Architects
  • BI/Analytics/Data/System Developers and Administrators

 

Then, you need to ask all these stakeholders, especially those from the business:

What are the drivers for BI & Analytics? And what is the level of importance for each of these drivers?

This will help you to understand and document what business needs are creating the need for new or modified BI & Analytics solutions. You should then go deeper to understand … what are the business objectives and goals that are driving these business needs.  This will help you to understand and document the bigger picture so that a more comprehensive strategy and roadmap can be created.

The questions and discussions surrounding the above will require deep and broad business involvement. Getting the perspective of a wide range of users from all business areas that are using the BI & Analytics Systems is critical.  The business should be involved throughout the process of creating the strategy and roadmap, and all decisions should tie back to support for business objectives and goals. And the trail leading to all these decisions must be documented.

Some examples of business drivers include:

  • Gain more insight into who our best customers are and how best to acquire them.
  • Understand how weather affects our sales/revenue.
  • Determine how we can sell more to our existing customers.
  • Understand what causes employee turnover.
  • Gain insight into how we can improve staffing schedules.

 

And examples of business objectives and goals may include things like:

  • Increase corporate revenues by 10%
  • Grow our base of recurring customers
  • Stabilize corporate revenues over all seasons
  • Create an environment where employees love to work
  • Reduce payroll costs without a reduction in staff, for example, reduce turnover.

 

Then, turn to understanding and documenting the current scenario (if not already known). Identify what systems (including data sources) are in place, who are using them (and why and how), what capabilities do they offer, what are the must-haves, and what are the pain points and positive highlights.

Also, you will need to determine the current workload (and future workload if it can be determined) of the primary team members involved in developing, testing, and implementing BI & Analytics solutions.

This will help you understand a few things:

  • Some of the highest priority needs of the users
  • Gaps in capabilities and data between what is needed and what is currently in place (including an understanding of what is liked and disliked about the current systems)
  • Current user base knowledge and engagement
  • IT knowledge and skills
  • Resource availability – when are people available to work on new initiatives

 

What are the options and limitations?

  • Can existing systems be customized to meet the requirements?
  • Can they be upgraded to a new version that has the needed functionality?
  • Do we need to consider adding a new platform or replacing one or more of the existing systems with a new platform?
  • Can we migrate from/integrate one system to/with another system that we already have up and running?
  • Are any of our current systems losing vendor support or require an upgrade for other reasons? Has the pricing changed for any of our software applications?
  • What options does our budget permit us to explore?
  • What options do our knowledge and skills permit us to explore?

 

Once you have identified these items …

  • Identify and engage stakeholders, and document these roles and the people
  • Identify and document business drivers, objectives and goals
  • Understand and document the current landscape – needs (including must-haves), technology, gaps, users, IT staff, resource availability, and more
  • Identify and document options – based on current landscape, technology, budget, staff resources, etc.

… you can develop a “living” Strategy and Roadmap for BI & Analytics. And when I say “living”, I mean it will not be a static document, but will be fine-tuned over time as new information emerge and as changes arise in business needs, technology, and staff resources.

 

Your Strategy and Roadmap for BI & Analytics should include, but is not limited to:

  • BI & Analytics that will be used to satisfy business drivers, objectives and goals
  • Data acquisition and storage plan for meeting the analytics needs
  • Technology platforms that will be used to process and store data, and deliver the analytics
  • Information about any new technologies that needs to be acquired or implemented, and schedules
  • Roles and Responsibilities for all stakeholders involved in BI & Analytics projects
  • Planned staffing allocations and schedules
  • Planned staffing changes and schedules
  • User training (business users) and Delivery team training (technical implementers & developers for example)
  • List dependencies for each item or set of items

Implementing Reports-To data-level security in Oracle BI (OBIEE)

In a previous post, Implementing data-level security in Oracle BI (OBIEE), I described data-level security and how to implement it in Oracle Business Intelligence (OBIEE).  In this post I will describe a special type of data-level security, called Reports-To security, and how to implement it in OBI.

For Reports-To data-level security, we want to secure data in such a way that we allow a user access only to data for his/her direct and indirect reports. In other words, each user will be able to see data only for people that are below him/her in the organization hierarchical chain.

Take a look at this example diagram:

ReportsTo_Security_Org_Position_Hier

If Reports-To security is applied to this example, Position# 303 would only be able to see information for Position# 409; and Position# 305 would only be able to see information for Position#’s 410, 411, 412; and a final example, Position# 201 would be able to see the information for Position#’s 303, 304, 305, 306, and 409, 410, 411, 412.

I use “Position” as the driving entity in the hierarchy instead of “Employee” because there are times when a position is vacant (no employee) and so it’s better to use the position which will always have a value.  However, you can use Employee if that works better in your scenario or if that’s what your data supports.

Let’s move on to how to implement this type of security.  The steps are similar to the steps in a previous post, Implementing data-level security in Oracle BI (OBIEE), but with some key differences. (Refer to that post for some of the more detailed steps not reiterated in this post.)

First, build a Reports-To data table and create the necessary ETL to ensure that it remains correct and up-to-date.  This table will contain each position (employee/user) and what position (employee) they report to. The data for this table will likely come from your HR system (such as PeopleSoft, Oracle EBS, SAP, Workday, home-grown system, etc.) that contains all the position and employee data.  Using the Organization Position Hierarchy diagram example, the table (REPORTS_TO_DATA) may look something like this:

REPORTS_TO_DATA

Next, create a Session Initialization Block (Init Block) with row-wise Initialization that will be used to get the list of all positions that report to the position of the current user and store them in a defined Target Variable.  If you log in, the Init Block will generate the list with all the positions (or employees) that report to you; and when Jane logs in, the Init Block will generate the list of all the positions (or employees) that report to her.

An important component of the SQL in the Init Block is that it needs to be recursive, because for each person, it needs to retrieve their direct reports, and then retrieve the people reporting to their direct reports, and so on down the line.  Using the above Organization Position Hierarchy diagram example, when the user in Position 202 logs in, the SQL needs to retrieve the positions reporting to 202 (which are 307 & 308), and then recursively retrieve the positions reporting to 307 and 308, and so on. The Target Variable used for storing the values in this example is: REPORTS_TO_POSITIONS

The Init Block, its SQL, and variable definition may look something like this:

Reports_To_Position_InitBlock2

 

Then finally, we need to create the data filters on the appropriate data sets (that need to be secured) using the variable containing the “list of positions” reporting to the current user (REPORT_TO_POSITIONS variable).  The needs to be done for each role that will access the reports that need to be secured by Reports-To security.

REPORTS_TO_Data_Filter

After this is all set, then Reports-To Security will be in effect for the filtered data sets and the reports that use them.

If you need to make it such that each user can only see data for his or her direct reports, the SQL can be modified to remove the recursion, and just return the direct report positions.

One final point … as you would with all changes, but particularly with solutions involving sensitive data, test your solution thoroughly – including making sure to perform both positive and negative testing.

Thanks for reading!

 

Implementing data-level security in Oracle BI (OBIEE)

Data Level Security involves securing the data available in an application in such a way that each user will see only the data that he/she is authorized to see, resulting in each user possibly seeing different results on the same report.   In this post I will describe how to implement data-level security in Oracle Business Intelligence (OBIEE).

Let’s use an example to describe data-level security.  Each user of the BI system works in or is assigned to a particular Business Unit.  Each user is allowed to see only the data for his or her assigned Business Unit.

In our example, the below table lists the 4 users and the Business Unit that each of them works in or is assigned to, and therefore, should have access to.  We will call this the USER_TO_BUSINESSUNIT table.
DataLevelSecurity_UsersBUs

Jane and Xing should only be able to see data for Business Unit BU2000, Bill should be able to access data for both BU3000 and BU4000, and Venkat should be able to access data for BU4000.

Now, we will use the below table as the example data set that we need to secure with the Business Unit data-level security.  We will call this table TRANSACTION_DATA.
DataLevelSecurity_AllData

When data-level security is applied …

Jane and Xing will be able to access/see the following data:
DataLevelSecurity_BU2000

Bill will able to access/see the following data:
DataLevelSecurity_BU3000_and_BU4000

And Venkat will be able to access/see the following data:
DataLevelSecurity_BU4000

So, now let’s move on to how to implement data-level security in OBI to achieve what was described above.

First, ensure that the USER_TO_BUSINESSUNIT table data is correct and up-to-date, and that there is an ETL in place or some other method of keeping that data updated. You want to ensure that if and when a user’s Business Unit changes, it is reflected in this table so that the user will have access to the appropriate data.

Next, create a Session Initialization Block with row-wise Initialization that will be used to get the list of Business Units that a user has access to.

Open the RPD -> Manage -> Variables
ManageVariables

In the Variable Manager -> Action -> New -> Session -> Initialization Block

This needs to be a “Session” Init block so that it will run each time a user logs in, and gets that user’s list of Business Units; and it needs to be row-wise because some users will have more than 1 value returned.

New_Session_InitBlock

In the Session Variable Initialization Block Dialog, enter a Name for the Init Block.

Then click Edit Data Source
InitBlockDialog

In the Data Source dialog, enter the SQL to get the Business Units for the current logged in user.  Click OK when done which closes this window and brings you back to the Session Variable Initialization Block Dialog.

InitBlockSQL

Click Edit Data Target in the Session Variable Initialization Block Dialog.

Enter your Variable name and check “Row-wise initialization”. As mentioned above, we need to select row-wise because our Init Block SQL may return more than 1 value for some users.   For example, when Bill in our example above data logs in, the Initialization Block will return values BU3000 and BU4000, and store them in the Target Variable, “BUSINESS_UNIT”.

You may also check “Use caching” to store the values in cache. Click OK when done.

SessionInitBlock_RowWiseTargetVariable
Then click OK to save the Init Block.

InitBlock_SetupComplete

Next, apply data filter(s) to the appropriate data set(s) for the appropriate role(s) using the Target Variable above.  You may have role(s) specifically used for data-level security and will need to apply it there, but if not, you will need to apply the filters in each role that has access to the datasets/dashboards/reports that you want to apply data-level security to.

Manage -> Identity
ManageIdentity

Go to the Application Roles tab, and select the Application Role to which you would like to apply the data-level security.  In the APplication Role dialog, click Permissions.
IdentityManager_ApplicationRole

In the Permissions dialog, select the layer and data table that you want to apply the data security to, and then enter the appropriate filter.  In this example, you are filtering by BUSINESS_UNIT.  This will cause the data to be filtered to only include each users’ Business Units.
DataFilter

Save your changes.  You have now applied data-level security.  This is what will happen now:

User logs in -> Init Block runs and selects the Business Units associated with the user’s User ID -> Init Block assigns value(s) to the variable BUSINESS_UNIT -> if the user is a member of a role that has data security applied to -and- the user visits the report -> the data filter will be triggered/run -> User only sees data for the Business Units the user is allowed to see.

Look out for my upcoming post on implementing a special type of data-level security: Reports-To Data Level Security.

Thanks for reading!

Oracle Advanced Security Summary

With the expansion of Self-Service BI, BI Teams need to be more vigilant about protecting sensitive data.
This is a summary of options available for protecting data in Oracle databases.
The information in this post was found here and summarized for a quick read: https://docs.oracle.com/database/121/ASOAG/toc.htm

The 3 features available are (1) Transparent Data Encryption, (2) Data Redaction, and (3) Data Masking and Subsetting Pack.
Here is a quick summary.

(1) Transparent Data Encryption (TDE)

  • Encrypt data so only authorized people can see it
  • Use it to protect sensitive data that maybe in an unprotected environment, such backup data sent to a storage facility
  • You can encrypt an individual column or an entire tablespace
  • Applications using encrypted data can function just the same

(2) Data Redaction

  • Enable the redaction (masking) of column data in tables
  • Redaction can be full, partial, based on regular expressions, or random
    • Full redaction: replaces strings with a single blank space ‘ ‘; numbers with zero (0); dates with 01-JAN-01
    • Partial redaction: replaces a portion of the column data; for example SSN: ***-**-1234
    • Regular expressions: can be used to perform partial or full redactions
    • Random: generates random values for display when accessed
  • The redaction takes place at runtime; not in the permanent data stored

(3) Oracle Enterprise Manager Data Masking and Subsetting Pack

  • enables you to create a “safe” development or test copy of the production database

 

Let’s look into some more details …

(1) Transparent Data Encryption (TDE)

  • TDE uses a two-tiered key-based architecture
  • TDE column encryption uses the two-tiered key-based architecture to transparently encrypt and decrypt sensitive table columns. The TDE master encryption key is stored in an external security module, which can be an Oracle software keystore or hardware keystore. This TDE master encryption key encrypts and decrypts the TDE table key, which in turn encrypts and decrypts data in the table column.
  • A Key Management Framework is used for TDE to store and manage keys and credentials.
    • Includes the keystore to store the TDE master encryption keys and the management framework to manage keystore and key operations
    • The Oracle keystore stores a history of retired TDE master encryption keys, which enables you to change them and still be able to decrypt data that was encrypted under an earlier TDE master encryption key.
  • Types of Keystores
    • Software keystores
    • Hardware, or HSM-based, keystores
  • Types of Software Keystores:
    • auto-login software keystores that are local to the computer on which they are created.
    • cannot be opened on any computer other than the one on which they are created.
    • typically used for scenarios where additional security is required while supporting an unattended operation
    • Password-based software keystores
      • protected by using a password that you create. You must open this type of keystore before the keys can be retrieved or used.
    • Auto-login software keystores
      • protected by a system-generated password, and do not need to be explicitly opened; automatically opened when accessed.
      • can be used across different systems; ideal for unattended scenarios.
    • Local auto-login software keystores
  • Steps for configuring a Software Keystore
    • Step 1: Set the Software Keystore Location in the sqlnet.ora File
    • Step 2: Create the Software Keystore
    • Step 3: Open the Software Keystore
    • Step 4: Set the Software TDE Master Encryption Key
    • Step 5: Encrypt Your Data
  • Oracle Database checks the sqlnet.ora file for the directory location of the keystore, whether it is a software keystore or a hardware module security (HSM) keystore.
  • You cannot change an existing tablespace to make it encrypted
  • You can create or modify columns to be encrypted

 

(2) Data Redaction

  • Define data redaction policies to specify what data needs to be redacted
  • Use policy expressions to set whether a user sees the redacted data or the full data
  • Policy Procedures
    • DBMS_REDACT.ADD_POLICY
    • DBMS_REDACT.ALTER_POLICY
    • DBMS_REDACT.ENABLE_POLICY
    • DBMS_REDACT.DISABLE_POLICY
    • DBMS_REDACT.DROP_POLICY
  • Sample scrip
    • BEGIN
      DBMS_REDACT.ADD_POLICY(
      object_schema => ‘hr’,
      object_name => ’employees’,
      column_name => ‘commission_pct’,
      policy_name => ‘redact_com_pct’,
      function_type => DBMS_REDACT.PARTIAL, –partial;  use DBMS_REDACT.FULL for full
      function_parameters => DBMS_REDACT.REDACT_US_SSN_F5,  — many standard params, but it can also be custom
      expression =>  ‘SYS_CONTEXT(”SYS_SESSION_ROLES”,”MGR”) = ”FALSE”’); –allows MGR role to see data
      policy_description => ‘Partially redacts 1st 5 digits in SS numbers’,
      column_description => ‘ssn contains Social Security numbers’);
      END;
      /
  • Use DBMS_REDACT.ALTER_POLICY and action => DBMS_REDACT.ADD_COLUMN to redact multiple columns
  • Redaction takes place on select lists and not on where clauses
  • Be aware of the scenarios when using redacted tables to build other tables or views

 

(3) Oracle Enterprise Manager Data Masking and Subsetting Pack (DMSP)

  • DMSP enables you to create a development or test copy of the production database, by taking the data in the production database, masking this data in bulk, and/or creating a subset of the data, and then putting the resulting masked data and/or subset of data in the development or test copy.
  • You can still apply Data Redaction policies to the non-production database, in order to redact columns
  • Used to mask data sets when you want to move the data to development and test environments.
  • Data Redaction is mainly designed for redacting at runtime for production applications

——–

I hope you found this helpful to get you started on taking the steps to protect your data internally and externally.
You can visit the link I provided above to find more details.

How to generate detailed Oracle BI (OBIEE) Repository Documentation

In this post, I will show the steps for using the OBIEE “Repository Documentation” utility to generate repository (RPD) lineage information.  I will also provide a couple example of how this documentation (output file) can be used.

To access and run the Repository Documentation utility,  from the BI Admin Tool menu, select Tools -> Utilities.

biadmintool_menu_tools_utilities

From the Utilities dialog, select “Repository Documentation”, and click “Execute…”

utilitiesdialog

In the “Save As” dialog, select the destination and enter the name you would like for the output file.

saverepositorydocumentationdialog

When it finishes, it will generate the output csv file.  Note  – this will likely be a large file.  It will contain all your repository objects.

obieerepositoryoutputfile

The RPD documentation file will contain the following columns:
Subject Area, Presentation Table, Presentation Column, Description – Presentation Column, Business Model, Derived logical table, Derived logical column, Description – Derived Logical Column, Expression, Logical Table, Logical Column, Description – Logical Column, Logical Table Source, Expression, Initialization Block, Variable, Database, Physical Catalog, Physical Schema, Physical Table, Alias, Physical Column, Description – Physical Column

You can use this file to quickly track lineage from physical sources to the logical columns to the presentation columns and identify all the logic and variables in between.
You can also use it to identify where and how much a specified table, column, variable, etc. is used which will help you to identify dependencies and know the effect of making changes or deleting elements.

Development, Data Governance, and Quality Assurance teams may find this information useful in this format.

It’s all about the users – Identifying Users for your BI applications / dashboards

One of the first things you will need to do before developing your Business Intelligence (BI) applications or dashboards is … identify who will use it.  You need to identify who will be using the application – what business areas they belong to, what groups they belong to, what are the various functions or roles within those groups, and eventually, who are the actual people.  After identifying the various roles (groups of users typically associated with a business process or function), then you can identify their needs.  Starting any development before knowing who will be using the system could result in a lot of wasted time and effort or a sub-optimal system.  The grouping of information on dashboards, the available functionality and security will be driven by these roles and their respective needs.

After identifying the various functions or roles that users posses, then it is important to understand how each role performs their job functions.  You need to understand what information they need and in what order, how it’s used, and the level of detail required at various stages. With this information, you will determine the dashboards, dashboard pages and their order, the information on each dashboard page and its precedence and level of detail, and what detailed information is needed via drill down. Basically, you will be creating the analytic workflows for the identified roles and the various processes, functions and tasks that they perform.

When performing the above exercise, please be as discrete as possible.  For example, even if someone doubles as an AP/AR Analyst, you should still analyze and plan for 2 separate roles – AP Analyst and AR Analyst – because those are 2 separate functions.  Later, the individual or group can be granted permissions to both roles.  From a security standpoint in general, you will create the necessary BI application roles to support your business roles.  And then assign security based on these roles.

In general, always keep the focus on the users, what they need to accomplish, and the most efficient ways to help them perform their jobs.  When you build the BI security and dashboards to meet those needs and usage scenarios, it will result in higher and faster user adoption.  This will take time, so do not rush the process.  Get detailed information about all the steps in their workflow upfront, document it, and then build around it.  However, on the other hand, you do not have to document to perfection upfront, you can take a more agile approach of developing based on fairly good user profiles to give users working prototypes, and then adjusting as new information and feedback is received from the users.

Good luck identifying your users and their needs as you get your BI project rolling.  And remember, it’s all about the users!

Creating a new Security Realm in OBIEE 11g

When setting up security in OBIEE 11g, you may modify the default security realm (myrealm) that installs with OBIEE.   But even better, you may create a new security realm and leave the default realm untouched.

My preference is to leave the default realm untouched and create a new realm – this is a best practice in my opinion.  I think it is helpful to always be able to go back and look at the features and settings of the default security realm.  And you can name your new security realm more appropriately, such as, ABCIncSecurityRealm.

The link below (Paul Cannon’s Blog) brings you to a blog post about configuring OBIEE to use LDAP authentication, and the first part of the post covers creating the new security realm.  It is very detailed. I used it the first time I created a new security realm.

http://paulcannon-bi.blogspot.com/2012/07/configuring-ldap-authentication-for.html

Good luck creating your new security realm.