Blogs

Data Visualization

Get the right information, with visual impact, to the people who need it

Data Visualization | Learn SAS | Programming Tips

Rick Wicklin, December 16, 2024

A normal Christmas tree

O Christmas tree, O Christmas tree, How lovely are your branches! SAS programmers have a long history of creating yuletide-themed graphics. Christmas trees are a popular image because of their simplicity. I admit that I have indulged more than once in this holiday tradition: An old-school ASCII art image A

Advanced Analytics | Analytics | Data Management | Data Visualization

Using ModelOps for scalable analytics in the public sector

Javier López Gómez, November 27, 2024

The importance of good input data in analytical models

Today, analytical models are essential tools for making data-driven decisions. From forecasting trends to optimizing operations, analytical models depend heavily on the quality of their input data. The accuracy, completeness, and relevance of that data are crucial for obtaining reliable results

SAS Spain | Spanish

Banking | Communications | Energy & Utilities | Government | Insurance | Life Sciences | Manufacturing | Retail

Analytics | Data Visualization

Rick Wicklin, November 25, 2024

Order variables by using a loading plot

The article "Order two-dimensional vectors by using angles" shows how to re-order a set of 2-D vectors by their angles. Because angles are on a circle, which has no beginning and no end, you must specify which vector will appear first in the list. The previous article finds the largest
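
The idea of ordering 2-D vectors by angle, with a caller-chosen starting vector, can be sketched in a few lines. This is a minimal Python illustration, not the SAS code from the article; the function name `order_by_angle` and the `first_index` parameter are hypothetical, and the sketch assumes counterclockwise ordering and nonzero vectors.

```python
import math

def order_by_angle(vectors, first_index=0):
    """Return the indices of `vectors` in counterclockwise order by angle,
    starting with the vector at `first_index` (a hypothetical parameter)."""
    # Angle of the vector that the caller wants to appear first.
    start = math.atan2(vectors[first_index][1], vectors[first_index][0])

    def angle_from_start(i):
        x, y = vectors[i]
        # Measure each angle relative to the start, wrapped into [0, 2*pi).
        return (math.atan2(y, x) - start) % (2.0 * math.pi)

    return sorted(range(len(vectors)), key=angle_from_start)

# Four unit vectors along the axes; begin the ordering at (-1, 0).
vectors = [(1, 0), (0, 1), (-1, 0), (0, -1)]
print(order_by_angle(vectors, first_index=2))
```

Because a circle has no natural beginning, the wrap-around step is what makes the chosen vector come first and every other vector follow in rotational order.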

Analytics | Data Management | Data Visualization

Danny Sprukulis, November 23, 2024

Project status and optimization in SAS Viya: Here’s what you need to know

In SAS Viya, users can customize a project management environment by using a file that contains metadata about the organization’s project progress. This process allows the management team to track and interact with the project’s ongoing steps.

Data Visualization | Learn SAS | Programming Tips

Rick Wicklin, October 28, 2024

Automate the creation of a range attribute map in SAS

In SAS, range attribute maps enable you to specify the range of values that determine the colors used for graphical elements. There are various examples that use the GTL to define a range attribute map, but fewer examples that show how to use a range attribute map with PROC SGPLOT.

Advanced Analytics | Analytics | Artificial Intelligence | Data Visualization | Machine Learning

Falko Schulz, August 22, 2024

Unveiling Oceanus: Harnessing SAS Visual Analytics to combat illegal fishing networks

As part of this year's IEEE Visual Analytics Science and Technology (VAST) Challenge, a group of SAS data scientists put SAS Viya and related machine learning tools to the ultimate test: identifying individuals in a complex fishing network. The team was excited to receive the Honorable Mention Award for Breadth of Investigation!

Government | Insurance

Data Visualization | Machine Learning | Work & Life at SAS

Figure 3: A huge correlation matrix I printed in Python to detect collinearity in the data (when two variables explain the same thing, rendering one of them redundant!). Darker squares mean two variables are highly correlated.

Ava Klissouras, July 31, 2024

Integrating SAS and Python: An intern's journey of growth

Learn how an intern integrated SAS Viya® and open-source code (Python) into a Machine Learning project to combine their strengths within the context of predictive modeling, and to show off the variety of ways this integration can be accomplished.

Data Visualization

Cesca Araullo, July 2, 2024

Building a cultural wordbook via SAS Visual Text Analytics

SAS Visual Text Analytics can analyze similar words and phrases from various cultural heritage documents to construct a heritage wordbook that cultural workers can use to identify which conservation technique to use on a structure or artifact.

Analytics | Data Visualization

Rick Wicklin, June 19, 2024

Scale a density curve to match a histogram

This article discusses how to scale a probability density curve so that it fits appropriately on a histogram, as shown in the graph to the right. By definition, a probability density curve is scaled so that the area under the curve equals 1. However, a histogram might show counts or
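
A small sketch can make the scaling concrete. It is written in Python rather than SAS and is not the article's code; the function names and the `scale` keywords are hypothetical. It relies only on the standard fact that a histogram of n observations with bin width h has total bar area n*h on the count scale and 100*h on the percent scale, so a unit-area density must be multiplied by that factor to overlay it.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2); the area under this curve equals 1."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def scale_density(density, n, bin_width, scale="count"):
    """Rescale a unit-area density value to match a histogram's vertical axis.

    The `scale` keywords below are illustrative names, not SAS options.
    """
    factors = {
        "density": 1.0,               # histogram already has unit area
        "proportion": bin_width,      # bar heights sum to 1
        "percent": 100.0 * bin_width, # bar heights sum to 100
        "count": n * bin_width,       # bar heights sum to n
    }
    return density * factors[scale]

# Example: 200 observations, bins of width 0.5, standard normal model.
peak = normal_pdf(0.0)
print(scale_density(peak, n=200, bin_width=0.5, scale="count"))
print(scale_density(peak, n=200, bin_width=0.5, scale="percent"))
```

The same multiplication applies to every point on the curve, so in practice you would evaluate the density on a grid and scale the whole column before plotting it over the histogram.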

Analytics | Data Visualization | Learn SAS

Rick Wicklin, June 17, 2024

A bootstrap confidence interval for an R-square statistic

A previous article discusses a formula for a confidence interval for R-square in a linear regression model (Olkin and Finn (1995) "Correlations redux", Psychological Bulletin). The formula is useful for large data sets, but should be used with caution for small samples. At the end of the previous article, I

Analytics | Data Visualization | Programming Tips

Rick Wicklin, June 10, 2024

The distribution of the R-square statistic

A SAS analyst ran a linear regression model and obtained an R-square statistic for the fit. However, he wanted a confidence interval, so he posted a question to a discussion forum asking how to obtain a confidence interval for the R-square parameter. Someone suggested a formula from a textbook (Cohen,

Data Visualization | Fraud & Security Intelligence

Onur Dinc, June 6, 2024

Using centrality metrics to detect illicit financial flows

Detecting illicit financial flows requires much more than traditional business methods. Using centrality metrics in investigations and analytical models broadens the range of detection approaches.

Analytics | Data Visualization | Learn SAS

Rick Wicklin, June 3, 2024

Visualize a multivariate regression model when using spline effects

A SAS analyst read my previous article about visualizing the predicted values for a regression model that uses spline effects. Because the original explanatory variable does not appear in the model, the analyst had several questions: How do you score the model on new data? The previous example has only

Advanced Analytics | Data Visualization

Kevin Scott, May 24, 2024

Identifying time delays in batch manufacturing for accurate anomaly detection

Batch manufacturing involves producing goods in batches rather than in a continuous stream. This approach is common in industries such as pharmaceuticals, chemicals, and materials processing, where precise control over the production process is essential to ensure product quality and consistency. One critical aspect of batch manufacturing is the need to manage and understand inherent time delays that occur at various stages of the process.

Data Visualization | Programming Tips

Rick Wicklin, May 22, 2024

Create filled density plots in SAS

A SAS programmer wanted to visualize density estimates for some univariate data. The data had several groups, so he wanted to create a panel of density estimates, which you can easily do by using PROC SGPANEL in SAS. However, the programmer's boss wanted to see filled density estimates, such as

Advanced Analytics | Analytics | Artificial Intelligence | Data Management | Data Visualization

Amaya Cerezo, May 7, 2024

Building the 'Master Data Scientist': the data Jedi in the service of strategy

In recent years, data science has grown exponentially and has become a fundamental pillar of organizational strategy in every industry. However, for experienced data scientists, the data landscape is in a state of constant change.

SAS Spain | Spanish

AgTech | Banking | Communications | Education | Energy & Utilities | Government | Health Care | Hospitality | Insurance | Life Sciences | Manufacturing | Retail | Sports & Entertainment | Travel

Data Visualization | Learn SAS | Programming Tips

Rick Wicklin, May 6, 2024

Visualize patterns of missing values

Years ago, I wrote an article that showed how to visualize patterns of missing data. During a recent data visualization talk, I discussed the program, which used a small number of SAS IML statements. An audience member asked whether it is possible to construct the same visualization by using only

Data Visualization | Programming Tips

Rick Wicklin, April 1, 2024

Add a second axis to a SAS graph

Recently, I saw a scatter plot that displayed the ticks, values, and labels for a vertical axis on the right side of a graph. In the SGPLOT procedure in SAS, you can use the Y2AXIS option to move an axis to the right side of a graph. Similarly, you can

Data Visualization | Learn SAS

Rick Wicklin, February 28, 2024

Using colors to visualize groups in a bar chart in SAS

I sometimes see analysts overuse colors in statistical graphics. My rule of thumb is that you do not need to use color to represent a variable that is already represented in a graph. For example, it is redundant to use a continuous color ramp to represent the lengths of bars

Analytics | Data Visualization | Programming Tips

Chris Hemedinger, February 27, 2024

Visualized: US Currency in circulation, past and present

This phenomenon has been in the news recently, so I've updated this article that I originally published in 2017. The paper currency in circulation in the US is mostly $100 bills. And not just by a little bit -- these account for 34% of the notes by denomination and nearly

Data Visualization | Programming Tips

Peter Styliadis, January 19, 2024

My Family of Four's Monthly Water Usage (Gallons) Compared to the Town of Cary's Average

Have you ever been curious about your monthly water consumption and how it compares to others in your community? Recently, I had this question and decided to get ahold of my family's water usage data for analysis. Harnessing the power of data visualization, I compared my family of four's monthly

Analytics | Data Visualization | Learn SAS | Programming Tips

Rick Wicklin, January 3, 2024

Top 10 posts from The DO Loop in 2023

In 2023, I wrote 90 articles for The DO Loop blog. My most popular articles were about SAS programming, data visualization, and statistics. In addition, several "general interest" articles were popular, including my article for Pi Day and an article about AI chatbots. If you missed any of these articles,

Advanced Analytics | Analytics | Data Management | Data Visualization | Learn SAS | Students & Educators | Work & Life at SAS

Adriana Rojas, December 19, 2023

"There are more and more courses linked to analytics topics in every sector"

Accurate information is the foundation on which companies are built, especially in a context in which preparedness and resilience are increasingly important. With the growth in the amount of available data and the need to take advantage of it to achieve better results, we have also

SAS Spain | Spanish

Data Visualization | SAS Administrators

soyongyun, December 13, 2023

Handy SAS Viya 4 features worth knowing: Logging & Monitoring

SAS Viya 4, a cloud-based AI analytics platform, has a number of useful features. This article introduces the Logging & Monitoring capabilities for SAS Viya 4. 1. What is Logging & Monitoring? As the names suggest, logging and monitoring mean recording logs for a service and displaying its status visually. In earlier SAS Viya

Advanced Analytics | Analytics | Artificial Intelligence | Data Management | Data Visualization | Machine Learning | SAS Administrators

小林泉, December 8, 2023

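Managing the entire data analysis process: what it means to catalog knowledge that grows in a self-organizing way

Self-organization is the phenomenon in nature in which individuals, each behaving autonomously without any view of the whole, collectively produce an orderly whole.

A solution idea that has existed since 2010 is finally feasible

Going back more than a decade, to around 2010, a major manufacturer that we were supporting already faced several challenges: uneven data analysis skills among employees, raising the skill level of the organization as a whole, improving the productivity of data analysis work, and standardizing data analysis work so that it could withstand staff mobility. The SAS proposal team, which included me, quickly saw that the analytics life cycle platform that SAS provides could help with the problem. The idea was to electronically record all of the work across the analytics life cycle, starting from the business problem and continuing through the data used, the insights from data exploration, the data preparation process, the predictive modeling process, the models, and the decision process that embeds them in applications, and then to model and use the overall process itself, so that knowledge accumulates and is put to use in a self-organizing way. However, the IT environment of the time, not only SAS but everything around it, was siloed: infrastructure such as PCs and application architectures, the location of data, and security management. ModelOps environments other than SAS also had architectures that differed too much from system to system, and the customer's own data literacy still had many gaps. Even with SAS at the center, the surrounding development cost would have been far too high, so we abandoned the proposal.

Times have changed. The adoption of cloud technology, and the transformation and standardization of business processes that come with it, are now advancing rapidly. In step with this, SAS has elevated the MLOps framework with which it has led the market since those days into DecisionOps and, to make full use of cloud technology, has strengthened its cloud-native architecture and its consistency and agility as a platform. In the latest version of SAS Viya, it is finally possible to electronically record, manage, and use the work of the entire data analysis process, starting from the data, across the whole analytics life cycle.

Governance of data analysis assets that accumulates and uses knowledge in a self-organizing way

Challenges in recent data management initiatives

A separate blog post covers this in detail, but in many cases organizations are repeating the mistakes of the past. In short, if the goal is to foster a data analysis culture and spread self-service, then a data catalog or a DWH/DM data model design that aims to be complete as a snapshot at one point in time does not solve the problem. Five years later, another person or project will inevitably say, "With this, we cannot tell which data to use for analysis. This is a problem. Let's fix it." So what is the solution? Rather than managing and maintaining static information, record and govern all of the information related to data analysis work, which changes daily, keeps accumulating, and is continually evaluated, improved, and evolved. That means achieving the following three points, each of which is described in detail later.

Point 1: Manage all data analysis assets (knowledge)
Point 2: Automate and streamline data quality management, with governance
Point 3: Accumulate information in a self-organizing way through the power of internal social interaction

Before explaining what each point means, I would like to show, through the voices of users, what kind of world results when they are achieved. By recording the actions of individual users who analyze data freely, it becomes possible to manage, share, and reuse how data analysis is being done, without someone who oversees the whole having to conduct interviews or surveys to maintain the information.

"Who used which data, for what purpose, and how, and what was the result?"
"I need to explain the decision that this application produced. Who built this model? What training data was used? What was the modeling process?"
"Which data is used often? How should that data be used? What are the caveats?"
"Who is skilled at data analysis? Who might be able to help?"
"What is the state of data quality across the company? Is the balance between data quality and usage patterns appropriate? Are any users using data incorrectly?"

Traditionally, the in-house knowledge that matters when analyzing data autonomously was obtained only with considerable time and effort: holding internal study sessions, tracking down knowledgeable people to ask for their know-how, or digging through specification documents that are often inaccurate. That knowledge now forms in a self-organizing way.

What is the "information asset catalog"? How it differs from a typical "data catalog"

SAS calls the capability that realizes this world the "information asset catalog." It is a technology that makes the entire data analysis process manageable, searchable, linkable, and reportable. The major difference becomes clear when you contrast it with what is commonly called a "data catalog," which is the cause of many failures.

As I also noted in another blog post, for data analysts to practice self-service analysis, and for beginners to gather information on their own as much as possible and first master standard data analysis tasks, they need to draw on existing knowledge. Yet in the past, such knowledge could be obtained only by asking a few excellent analysts, or by asking the IT department and waiting so long for an answer that the business opportunity was lost. Existing knowledge is the whole chain of "thinking and method": what data was used, with what intent, for what purpose, and how, and what output was obtained. It is not something that an administrator finishes by interviewing analysts once and building a "data catalog." It is information that analysts create autonomously, day by day.

Point 1: Manage all data analysis assets (knowledge)

In SAS Viya, the objects from each step of the analytics life cycle described above are all recorded and managed centrally. Information about newly created reports, data preparation processes, and data marts is managed automatically every day and becomes searchable. By managing every step of the analytics life cycle in this way, you can view in relation to one another the data, the reports that use the data, the data preparation flows that use the data, their output data, and also the predictive modeling processes that use that output as training data and the models they produce. As a result, when you are looking for data for a particular purpose, for example, you can search by the name of a similar business task or project, reach the related reports and data preparation processes, and from there reach the data used and how it is used, which is an efficient way to find information. The IT department can, of course, also use this capability as the long-standing impact analysis feature, a tool for investigating the effects of changes to data.

Point 2: Automate and streamline data quality management, with governance

One of the points to watch when doing data analysis as an organization is accuracy. Whether the correct master data and data of appropriate quality are used affects the precision of the final actions and decisions, and therefore revenue. In addition, to be accountable for results, the quality of the data used for an action must be managed organizationally rather than left to individuals. Managing data quality organizationally also reduces the quality checks that were previously performed at the start of each analysis. And because quality checks that used to depend on individuals become standardized, the quality of data analysis work improves across the organization. One customer had errors in which required processing was not applied in the ETL jobs that load data into the DWH, but the number of data sets and ETL jobs was so large that the errors were hard to find. Comprehensive quality management and quality reports make such errors easy to find.

Point 3: Accumulate information in a self-organizing way through the power of internal social interaction

Through Point 1 above, the autonomous activities of individual analysts are, in essence, recorded automatically, accumulate in a self-organizing way as knowledge of the whole organization, and become shareable and reusable. This happens automatically, without individual analysts having to think about it. Beyond that, using the platform deliberately adds depth to the accumulated knowledge. For example, when you try to solve a business problem with data analysis, the starting point is a "question." It sits at the far-left start of the analytics life cycle described above. At that stage, you perform "data exploration" from various angles to form hypotheses and to test them. This initial data exploration process is the basis for the subsequent data preparation and modeling, so it is very important both as knowledge and as material for accountability. Because the exploration does not necessarily use the same data as the data finally used, it is not linked automatically to other data analysis assets. By saving such exploration processes in the same project folder, as the figure in the post illustrates, you can use them as related objects. In addition, proactively adding comments and ratings to the data and reports that you have used grows them into more valuable knowledge. Some companies and organizations now share know-how, such as how to use office tools, on internal social networks. The idea is to do the same on a company-wide analytics platform so that users build up data analysis knowledge among themselves.

Summary

"Can this data be used for this purpose?" "Oh, it lacks this information, so it can't. Here is the data that I use instead." This is a conversation that data analysts often have. How quickly this question can be answered determines the efficiency and accuracy of data analysis. The "information asset catalog" is precisely the capability for answering it.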

Data Visualization | Learn SAS

Rick Wicklin, December 6, 2023

10 tips for creating effective statistical graphics

"These are a few of my favorite things." (Maria in The Sound of Music) For my annual Christmas-themed post, I decided to forgo fractal Christmas trees and animated greeting cards and instead present a compilation of some of my favorite data visualization tips for advanced SAS users. Hopefully, this

Artificial Intelligence | Data Visualization | Machine Learning

Tom Sabo, December 4, 2023

Text analytics: A recipe for food safety success

As millions of people party and eat their way through the season of overindulgence, they should feel confident that indigestion and a few extra pounds should be the only downsides to their feasting. Thanks to countless hours of work behind the scenes by food inspectors and public health officials, diners

Analytics | Data Visualization | Programming Tips

Rick Wicklin, November 27, 2023

An example of finite-precision issues in a simple collinearity algorithm

The collinearity problem is to determine whether three points in the plane lie along a straight line. You can solve this problem by using middle-school algebra. An algebraic solution requires three steps. First, name the points: p, q, and r. Second, find the parametric equation for the line that passes
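
To illustrate why finite precision matters here, the following Python sketch tests collinearity with a tolerance instead of an exact comparison. It is not the parametric-line method that the excerpt begins to describe: it swaps in the standard cross-product (signed area) test, and the function name `is_collinear` and the default `rel_tol` value are assumptions chosen for illustration.

```python
def is_collinear(p, q, r, rel_tol=1e-12):
    """Return True if 2-D points p, q, r lie on one line, up to rounding error.

    Uses the cross product of (q - p) and (r - p), which is zero in exact
    arithmetic when the points are collinear. `rel_tol` is an assumed
    default, not a value taken from the article.
    """
    ux, uy = q[0] - p[0], q[1] - p[1]
    vx, vy = r[0] - p[0], r[1] - p[1]
    cross = ux * vy - uy * vx
    # Compare against the size of the terms being subtracted, so the test
    # does not depend on the units or magnitude of the coordinates.
    scale = max(abs(ux * vy), abs(uy * vx))
    return abs(cross) <= rel_tol * scale

print(is_collinear((0, 0), (1, 1), (2, 2)))
print(is_collinear((0, 0), (1, 1), (2, 3)))
print(is_collinear((0.1, 0.2), (0.2, 0.4), (0.3, 0.6)))
```

A relative tolerance is one design choice among several. Testing `cross == 0` exactly can fail for points whose coordinates, such as 0.1 or 0.3, are not exactly representable in binary floating point.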

Data Visualization | Programming Tips

Rick Wicklin, November 20, 2023

Data visualization tip: Plot rates, not counts

Plot rates, not counts. This maxim is often stated by data visualization experts, but often ignored by practitioners. You might also hear the related phrases "plot proportions" or "plot percentages," which mean the same thing but express the idea alliteratively. An example in a previous article about avoiding alphabetical ordering

Data Visualization | Learn SAS

Rick Wicklin, November 13, 2023

Tip: Avoid alphabetical order for a categorical axis in a graph

Howard Wainer, who used to write the "Visual Revelations" column in Chance magazine, often reminded his readers that "we are almost never interested in seeing Alabama first" (2005, Graphic Discovery, p. 72). His comment is a reminder that when we plot data for a large number of categories (states, countries,
