Proceedings
MWSUG 2024 Papers and Presentations
It's not too late to add YOUR presentation to this list. The Call for Presentations is open through September 13, so submit your idea right away!
Presentations by users are the heart of a SAS users group meeting. MWSUG 2024 will feature a variety of papers and presentations organized into several academic sections covering many different topics and experience levels.
Note: Content and schedule are subject to change. Last updated 31-Aug-2024.
- Analysis & Advanced Analytics
- Anything Data
- Basics and Beyond
- Business Leadership
- Hands-On Workshop
- Pharma & Healthcare
Analysis & Advanced Analytics
Paper No. | Author(s) | Paper Title (click for abstract) |
AL-027 | Ronald Fehd | Calculating Cardinality Ratio in Two Steps |
AL-035 | Jayanth Iyengar | The Everytown Research database: Using SAS® analytic procedures to analyze mass shootings |
AL-043 | Mark Pickering et al. | %SURVEYCANCORR Macro: Canonical Correlation for Complex Survey Data |
AL-046 | Scott Koval | Teaching Your Computer to See: Using Computer Vision to Detect Defects |
AL-051 | Brandy Sinco et al. | What if Proc Power Can't Help? Calculating Power via Simulation, Example with Count Regression |
AL-058 | Ryan Lafler & Anna Wade | Developing Artificial and Convolutional Neural Networks with Python's Keras API for TensorFlow |
AL-059 | Ryan Lafler | Charting Your Organization's Machine Learning Roadmap |
AL-061 | LeRoy Bessler | Revelation and Exploration for COVID-19 Data: Visual Data Insights™ and an InfoGeographic Atlas |
AL-070 | Stephen Sloan | The Business Value of Diversity Combined with Data Science |
Anything Data
Paper No. | Author(s) | Paper Title (click for abstract) |
AD-004 | Kirk Lafler | Data Literacy 101: Understanding Data and Extracting Insights |
AD-023 | Ronald Fehd | Q&A with the macro maven: is sql our lingua franca? |
AD-048 | Jose Centeno | Using Sas To Manage And Maintain Your Data Repository |
AD-056 | Josh Horstman | Creating and Customizing High-Impact Excel Workbooks from SAS with ODS EXCEL |
AD-060 | LeRoy Bessler | Bessler's Principles of Communication-Effective Data Visualization |
AD-066 | Jake Reeser | Using SAS I/O Functions to Retrieve External Metadata |
Basics and Beyond
Business Leadership
Paper No. | Author(s) | Paper Title (click for abstract) |
BL-009 | Kirk Lafler | Soft Skills to Gain a Competitive Edge in the 21st Century Job Market |
BL-014 | David Corliss | Predicting the Impact of Storms on Utility Customers |
BL-031 | Jennifer Rosson | Automation with Custom SAS Tasks using SAS® Studio for Business Processes |
BL-036 | Jayanth Iyengar | SAS Job Searching and Interviewing tips – Strategies in the Post-Pandemic era |
BL-057 | Josh Horstman & Richann Watson | Adventures in Independent Consulting: Perspectives from Two Veteran Consultants Living the Dream |
BL-065 | David Corliss | Being A Statistical Expert Witness |
Hands-On Workshop
Paper No. | Author(s) | Paper Title (click for abstract) |
HW-003 | Chuck Kincaid | Machine Learning in R |
HW-013 | Kirk Lafler | Dashboards Made Easy Using SAS® Software |
HW-039 | Richann Watson & Josh Horstman | Complex Custom Clinical Graphs Step by Step with SAS® ODS Statistical Graphics |
HW-055 | Josh Horstman | Advanced Topics in ODS Graphics: Annotations, Attribute Maps, and Axis Tables |
HW-063 | Russ Lavery | An Animated Guide to the (largely) Undocumented Internals of PROC Report |
HW-064 | Russ Lavery | An Animated Guide to the Internals of GIT |
Pharma & Healthcare
Paper No. | Author(s) | Paper Title (click for abstract) |
PH-034 | Brooke Ellen Delgoffe | Paper Doesn't Listen |
PH-037 | Jayanth Iyengar | NHANES Dietary Supplement component: a parallel programming project |
PH-038 | Yanfang Pang | Harnessing the Power of PROC CONTENTS and SELECT INTO: SEPARATED BY |
Abstracts
Analysis & Advanced Analytics
AL-027 : Calculating Cardinality Ratio in Two Steps
Ronald Fehd, Fragile-Free Software
The cardinality of a set is the number of elements in the set. The cardinality of a SAS software data set is the number of observations of the data set, n-obs. The cardinality of a variable in a data set is the number of distinct values (levels) of the variable, n-levels. The cardinality ratio of a variable is n-levels / n-obs; the range of this value is from zero to one. Previous algorithms combined output data sets from the frequency and contents procedures in a data step. This algorithm reduces multiple frequency procedure steps to a single call, and uses scl functions to fetch contents information in the second data step. The output data set, a dimension table of the list of data set variable names, has variable cr-type whose values are in (few, many, unique, empty); this variable identifies the three main types of variables in a data set: few is discrete, many is continuous, unique is a row-identifier, and empty is a single value. The purpose of this paper is to provide a general-purpose program, ml-names-card-ratios.sas, which provides enhanced information about the variables in a data set. The author uses this list-processing program in Fast Data Review, Exploratory Data Analysis (EDA) and in Test-Driven Development (TDD), a discipline of Agile and Extreme Programming.
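The ratio defined in this abstract (n-levels / n-obs) can be illustrated with a minimal Python sketch. This is not the author's SAS program: the function name `cardinality_ratios` and the sample rows are hypothetical, and the few/many classification is left out because the abstract does not state its cutoffs.

```python
def cardinality_ratios(rows):
    """Return {variable: (n_levels, n_obs, ratio)} for a list of dict rows.

    n_obs is the number of rows, n_levels the number of distinct values
    of a variable, and ratio is n_levels / n_obs.
    """
    n_obs = len(rows)
    if n_obs == 0:
        return {}
    result = {}
    for name in rows[0]:
        n_levels = len({row[name] for row in rows})
        result[name] = (n_levels, n_obs, n_levels / n_obs)
    return result


# Hypothetical sample data: a row identifier, a discrete variable,
# and a variable that holds a single value.
rows = [
    {"id": 1, "sex": "F", "site": "A"},
    {"id": 2, "sex": "M", "site": "A"},
    {"id": 3, "sex": "F", "site": "A"},
    {"id": 4, "sex": "M", "site": "A"},
]
ratios = cardinality_ratios(rows)
for name, (n_levels, n_obs, ratio) in ratios.items():
    print(f"{name}: {n_levels} levels / {n_obs} obs = {ratio:.2f}")
```

In this sample, a ratio of 1 marks `id` as a row identifier, while `site` has a single level.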
AL-035 : The Everytown Research database: Using SAS® analytic procedures to analyze mass shootings
Jayanth Iyengar, Data Systems Consultants LLC
With mass shootings occurring every week, it can accurately be stated that mass shootings in the U.S. have reached the level of an epidemic. Everytown Research and Policy conducts independent, methodologically rigorous research and supports evidence-based policies to reduce the incidence of gun violence. In 2009, Everytown Research started assembling a Mass Shooting database which records key data on every mass shooting in the U.S. In this paper, I examine and explore the database using SAS® procedures to produce a series of tables, reports, graphics and visualizations. The goal of this project is to generate insights from the SAS analytics that guide the building of effective programs and policies to reduce the epidemic of mass shootings.
AL-043 : %SURVEYCANCORR Macro: Canonical Correlation for Complex Survey Data
Mark Pickering, Indiana University-School of Public Health, Bloomington
Taylor Lewis, George Mason University-Department of Statistics
Raul Cruz, Indiana University-School of Public Health, Bloomington
Classic Canonical Correlation Analysis (CCA) is a valuable statistical tool for exploring associations between two sets of variables. However, existing SAS 9.4 procedures lack the capability to incorporate complex survey design (CSD) features, such as replicate weights, clusters, and strata, as recommended in the literature when implementing CCA on variables appearing on a complex survey data set. To address this gap, we introduce the %SURVEYCANCORR macro in this paper. We begin by outlining the theoretical underpinnings of our algorithm, discuss the functionalities of the corresponding macro, and demonstrate its implementation through example analyses using publicly available nationally representative survey datasets. Our findings underscore the critical importance of accounting for CSD features in determining the statistical significance of canonical correlations.
AL-046 : Teaching Your Computer to See: Using Computer Vision to Detect Defects
Scott Koval, Pinnacle Solutions, Inc
A classic problem in manufacturing is the manual inspection of items produced. By using SAS Viya and Machine Learning, a neural network can be trained in order to assist in the inspection process. This presentation will demonstrate how computer vision can be applied to automatically assess if a casting unit produced is considered defective or not using a picture of the product.
AL-051 : What if Proc Power Can't Help? Calculating Power via Simulation, Example with Count Regression
Brandy Sinco, University of Michigan
Joceline Vu, Temple University
Pasithorn Suwanabol, University of Michigan
Background: Reviewers for an article comparing opioid use after surgery questioned the amount of power for a Zero-Inflated Negative Binomial (ZINB) model. Power was estimated by simulation because Proc Power does not have a ZINB option. Objectives: For a given dataset, show how to use Procs GenMod or CountReg to estimate the distribution parameters for Poisson, Negative Binomial, ZIP, and ZINB models. Then, use the streaminit() and RAND() functions to simulate distributions in a data step. Run the outcomes model on the simulated data and compute the power. Methods: Begin with one simulation. After calculating the ZINB parameters with Proc GenMod, compare the simulated distribution to the actual data with graphics and descriptive statistics from Procs Freq and Univariate. Next, simulate 1,000 datasets. Put an index on _REPLICATION_ to maximize execution speed. Run the outcomes procedure 1,000 times and compute the simulation-based power as the percentage of times when the outcome is significant. That is, compute power based on the definition of the probability of observing a significant difference between treatment and control, given that a significant difference is present in the data. To calculate power for the model with covariates, use bootstrapping. Compare the actual data to the simulated data with Procs Freq, Means, T-Test, and NPar1Way. Results: After outputting the results of Proc GenMod by _REPLICATION_ into an ODS dataset, the original dataset with 562 patients had 78% power to detect a significant difference in opioid use between surgical procedures without adjustment for additional covariates.
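The general idea of simulation-based power (simulate many datasets, test each one, and report the share of significant results) can be sketched in Python. This is a simplified illustration, not the authors' SAS code: it substitutes a plain Poisson model and a Wald z-test for the ZINB model, and the rates, group size, replicate count, and seed are hypothetical values chosen only for the demonstration.

```python
import math
import random
from statistics import NormalDist, mean


def draw_poisson(lam, rng):
    """Draw one Poisson(lam) count with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


def is_significant(control, treated, alpha=0.05):
    """Two-sided Wald z-test for a difference in mean counts.

    Uses the Poisson property variance == mean for the standard error.
    """
    m0, m1 = mean(control), mean(treated)
    se = math.sqrt(m0 / len(control) + m1 / len(treated))
    if se == 0:  # both groups are all zeros: no evidence of a difference
        return False
    z = (m1 - m0) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value < alpha


def simulated_power(lam_control, lam_treated, n_per_group, n_reps=1000, seed=2024):
    """Share of simulated datasets in which the test is significant."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_reps):
        control = [draw_poisson(lam_control, rng) for _ in range(n_per_group)]
        treated = [draw_poisson(lam_treated, rng) for _ in range(n_per_group)]
        hits += is_significant(control, treated)
    return hits / n_reps


# Hypothetical scenarios: a large true difference, and no difference at all.
power = simulated_power(2.0, 4.0, n_per_group=50, n_reps=200)
false_positive_rate = simulated_power(3.0, 3.0, n_per_group=50, n_reps=200)
print(f"power: {power:.2f}, false positive rate: {false_positive_rate:.2f}")
```

Running the same function with equal rates in both groups gives a quick sanity check: the rejection rate should then sit near the chosen alpha.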
AL-058 : Developing Artificial and Convolutional Neural Networks with Python's Keras API for TensorFlow
Ryan Lafler, Premier Analytics Consulting, LLC
Anna Wade, Premier Analytics Consulting, LLC
Capable of accepting and mapping complex relationships hidden within structured and unstructured data, neural networks are built from layers of neurons and activation functions that interact, preserve, and exchange information between layers to develop highly flexible and robust predictive models. Neural networks are versatile in their applications to real-world problems; capable of regression, classification, and generating entirely new data from existing data sources, neural networks are accelerating recent breakthroughs in Deep Learning methodologies. Given the recent advancements in graphical processing unit (GPU) cards, cloud computing, and the availability of interpretable APIs like the Keras interface for TensorFlow, neural networks are rapidly moving from development to deployment in industries ranging from finance, healthcare, climatology, video streaming, business analytics, and marketing given their versatility in modeling complex problems using structured, semi-structured, and unstructured data. This paper explores fundamental concepts associated with neural networks including their inner workings, their differences from traditional machine learning algorithms, and their capabilities in supervised, unsupervised, and generative AI workflows. It also serves as an intuitive, example-oriented guide for developing Artificial Neural Network (ANN) and Convolutional Neural Network (CNN) architectures using Python's Keras and TensorFlow libraries for regression and image classification tasks.
AL-059 : Charting Your Organization's Machine Learning Roadmap
Ryan Lafler, Premier Analytics Consulting, LLC
Machine learning is experiencing a golden age of investment, democratization, and accessibility across all domains in the life sciences, natural sciences, and social sciences with applications to industry for business decision-making, risk management, consumer marketing, clinical trials, financial forecasting, security recognition, video remastering, digital twin simulations, and more. But what exactly is machine learning (ML)? How is it connected to Artificial Intelligence (AI)? And most importantly, how can data scientists, programmers, software engineers, and/or researchers start their endeavors into machine learning? This paper answers these questions, and more, by providing a roadmap to help navigate the complexities of machine learning in an application-oriented guide. This paper covers the main aspects of machine learning including supervised, unsupervised, and semi-supervised approaches as well as deep learning. The roadmap for supervised machine learning starts with linear regression and progressively builds towards more complex and flexible algorithms with discussions about the advantages and disadvantages of using certain models over others. This paper discusses the real-world applications of both labeled and unlabeled data; supervised and unsupervised machine learning algorithms; overfitting and underfitting; cross-validation; and the importance of hyperparameter tuning to better fit algorithms to their data.
AL-061 : Revelation and Exploration for COVID-19 Data: Visual Data Insights™ and an InfoGeographic Atlas
LeRoy Bessler, Bessler Consulting and Research
A COVID-19 Tableau dashboard posted in the Data Visualization Group at LinkedIn was so underinforming and visually disappointing that I decided to see what could be done better with the SuperPower tool of SAS® ODS Graphics and by applying my (software-independent) principles of communication-effective use of graphics tools and color.
AL-070 : The Business Value of Diversity Combined with Data Science
Stephen Sloan, Dawson D R
Having strong diversity programs, being sensitive to diversity issues, and being able to apply rigorous data science techniques can provide significant value to an organization. Many business problems and challenges begin with the statement of a problem, and the problem sometimes seems to relate to issues with diversity. Many organizations are aware of both the importance of diversity and the fact that problems sometimes manifest themselves as diversity issues, even when they have other causes. At other times there are issues where awareness of the value of diversity can help an organization achieve its goals. In addition, there are areas where diversity can have a direct impact on the organization in terms of compliance and scientific value. In this paper I cite examples where diversity has proven to have business value for an organization.
Anything Data
AD-004 : Data Literacy 101: Understanding Data and Extracting Insights
Kirk Lafler, SasNerd
Data is all around us and growing at extraordinary rates, so anyone working with, or planning to work with, data should acquire a solid foundation in this essential skill. In this presentation, join Kirk Paul Lafler as he focuses on the fundamentals of data literacy, how to derive insights from data, and how data can help with decision-making. This presentation uses SAS® software to explore data with graphs and describes data with statistics to help anyone improve their understanding of data and to enable more informed data-driven decisions. Attendees learn about the different types of data - nominal, ordinal, interval, and ratio; how to assess the quality of data; explore data with scatterplots, line charts, and box plots; and use selected statistical methods to describe data. By the end of this presentation, you'll have a much better understanding of why data literacy is so important in the 21st century, as well as learning how to identify the correct types of data for you to use.
AD-023 : Q&A with the macro maven: is sql our lingua franca?
Ronald Fehd, Fragile-Free Software
SAS software provides an implementation of Structured Query Language (sql). Usage of the sql procedure creates an opportunity for the user to learn database terminology and consider how to address their programming problems and reporting as a matter of the design of the data structure. The purpose of this Q&A dialogue is to review the suitability of sql as a common language for users and programmers. Its benefits include a knowledge of design of data structure for ease of reporting as well as the ability to research the SAS global symbol table. The dialogue is an exposition of how we learn both natural and artificial languages, and how we use them to communicate.
AD-048 : Using Sas To Manage And Maintain Your Data Repository
Jose Centeno, NORC at the University of Chicago
In real-life applications, data is typically collected and stored in different systems and with different structures. End-users might not have access to the backend data source(s), or it can become confusing to keep track of these data locations. Building a centralized data repository helps your project bring data together for more effective analysis. This paper explores how Base SAS can be used by applying dynamic techniques to perform tasks for managing and maintaining your SQL-based data repository. This paper assumes a basic understanding of SAS macro programming and of database concepts.
AD-056 : Creating and Customizing High-Impact Excel Workbooks from SAS with ODS EXCEL
Josh Horstman, Nested Loop Consulting
Love it or hate it, Microsoft Excel is used extensively throughout the business world. As a SAS user, you can enhance the impact of your work by using the ODS EXCEL destination to create high-quality, customized output in Excel format directly from SAS. This paper walks through a series of examples demonstrating the flexibility and power of this approach. In addition to complete control over visual attributes such as fonts, colors, and borders, the ODS EXCEL destination allows the SAS user to take advantage of Excel features such as multiple tabs, frozen or hidden rows and columns, and even Excel formulas to deliver the high-impact results you and your customers want!
AD-060 : Bessler's Principles of Communication-Effective Data Visualization
LeRoy Bessler, Bessler Consulting and Research
Let me show you how to best get beyond your graphics software defaults. My list of principles is a L O N G list, but this paper and presentation is a short list. The long list is from 44 years as a data artist working to get the best out of SAS® graphics software. (In the words of Alan Bates, "Just think of it as a hobby that has gotten out of hand.") Come to see a subset story, demonstrated with widely applicable data graphics examples that you, too, can use. All but one, a dangerous but popular archaism, are rendered with SAS ODS Graphics, The Graphics SuperPower Tool. If you have SAS software, you already have SAS ODS Graphics, at no added expense, but you can, and dare I say SHOULD, use my principles for graphic design and color use with ANY software. (This topic features principles from the book Visual Data Insights Using SAS ODS Graphics: A Guide to Communication-Effective Data Visualization.)
AD-066 : Using SAS I/O Functions to Retrieve External Metadata
Jake Reeser, NORC
In this paper, I discuss external file functions, or I/O functions. I go through what information is available, as well as how to obtain metadata on specific files. I also cover how to gain information on external directories and subdirectories. Lastly, I show how to use this file metadata to allow for easier data processing. This paper is for those of an intermediate skill level. Knowledge of the SAS macro language is also required.
Basics and Beyond
BB-011 : Application of Fuzzy Matching Programming Techniques Using SAS® Software
Kirk Lafler, SasNerd
Stephen Sloan, Dawson DR (Data Research)
Data comes in all forms, shapes, sizes and complexities. Stored in files and datasets, SAS® users across industries recognize that data can be, and often is, problematic and plagued with a variety of issues. Data files can be joined without problem when each file contains identifiers, or "keys", with unique values. However, many files do not have unique identifiers and need to be joined by character values, like names or E-mail addresses. These identifiers might be spelled differently, or use different abbreviation or capitalization protocols. This paper illustrates datasets containing a sampling of data issues, popular data cleaning and user-defined validation techniques, data transformation techniques, traditional merge and join techniques, the introduction to the application of different SAS character-handling functions for phonetic matching, including SOUNDEX, SPEDIS, COMPLEV, and COMPGED, and an assortment of SAS programming techniques to resolve key identifier issues and to successfully merge, join and match less than perfect, or "messy" data. Although the programming techniques are illustrated using SAS code, many, if not most, of the techniques can be applied to any software platform that supports character-handling.
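As the abstract notes, these matching techniques carry over to other platforms. The Python sketch below illustrates matching on Levenshtein edit distance, the measure that the SAS COMPLEV function returns. It is a stand-in written for this listing, not the authors' code; the helper names `levenshtein` and `best_match` and the sample names are hypothetical.

```python
def levenshtein(a, b):
    """Edit distance: fewest insertions, deletions, and substitutions
    needed to turn string a into string b."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def best_match(name, candidates):
    """Return the candidate closest to name, ignoring case and outer spaces."""
    key = name.strip().lower()
    return min(candidates, key=lambda c: levenshtein(key, c.strip().lower()))


# Hypothetical "messy" key and a list of clean reference names.
reference_names = ["Jane Smyth", "John Smith", "Joan Smithers"]
match = best_match("  jon smith", reference_names)
print(match)
```

In practice a distance cutoff is usually added so that poor matches are rejected rather than forced; the right cutoff depends on the data and is not specified here.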
BB-016 : What's Your Favorite Color? Controlling the Appearance of a Graph
Richann Watson, DataRich Consulting
The appearance of a graph produced by the Graph Template Language (GTL) is controlled by Output Delivery System (ODS) style elements. These elements include fonts and line and marker properties as well as colors. A number of procedures, including the Statistical Graphics (SG) procedures, produce graphics using a specific ODS style template. This paper provides a very basic background of the different style templates and the elements associated with the style templates. However, sometimes the default style associated with a particular destination does not produce the desired appearance. Instead of using the default style, you can control which style is used by indicating the desired style on the ODS destination statement. However, sometimes not a single one of the 50-plus styles provided by SAS® achieves the desired look. Luckily, you can modify an ODS style template to meet your own needs. One such style modification is to control which colors are used in the graph. Different approaches to modifying a style template to specify colors used are discussed in depth in this paper.
BB-017 : Just Stringing Along: FIND Your Way to Great User-Defined Functions
Richann Watson, DataRich Consulting
Louise Hadden, Abt Global Inc.
SAS® provides a vast number of functions and subroutines (sometimes referred to as CALL routines). These useful scripts are an integral part of the programmer's toolbox, regardless of the programming language. Sometimes, however, pre-written functions are not a perfect match for what needs to be done, or for the platform on which the required work is being performed. Luckily, SAS has provided a solution in the form of the FCMP procedure, which allows SAS practitioners to design and execute User-Defined Functions (UDFs). This paper presents two case studies for which the character or string functions SAS provides were insufficient for work requirements and goals, and demonstrates the design process for custom functions and how to achieve the desired results.
BB-018 : An Animated Guide to Proc Report Internals
Russ Lavery, Contractor
PROC REPORT is an exciting "big file technique" that every programmer should know. PROC REPORT allows the creation of complicated reports, with many levels of summarization, while only reading a large source file one time. PROC REPORT allows, in just one step, summarization to a desired level, calculation of new variables, and the appending of different kinds of reports into one complex report. ALL this happens in one read of the input data. Especially interesting is that the internal file for PROC REPORT, a file that holds "the combined multiple report", can be sent to a SAS® data set and used as input data. This paper will attempt to show the time sequence of the internal actions of PROC REPORT. Knowing the time sequence of actions, especially calculations, is crucial to doing complicated PROC REPORTs.
BB-019 : From Muggles to Macros: Transfiguring Your SAS® Programs With Dynamic, Data-Driven Wizardry
Josh Horstman, Nested Loop Consulting
Richann Watson, DataRich Consulting
The SAS macro facility is an amazing tool for creating dynamic, flexible, reusable programs that automatically adapt to change. This presentation uses examples to demonstrate how to transform static "muggle" code full of hardcodes and data dependencies by adding macro language magic to create data-driven programming logic. Cast a vanishing spell on data dependencies and let the macro facility write your SAS code for you!
BB-022 : List Processing using SQL Select Into to Replace Call Symputx Creating Indexed Arrays of Macro Variables
Ronald Fehd, Fragile-Free Software
SAS software provides the ability to allocate a macro variable in a data step with its call symputx routine. This routine can be used to make a sequentially-numbered series of macro variables --- mvar1--mvarN --- which is referred to as an indexed array of macro variables. The purpose of this paper is to examine the algorithm of macro variable array usage and provide sql consolidations of the various tasks.
BB-030 : %SEND_MSOFFICE_EMAIL - Pretty, Powerful, and Painless
Brooke Ellen Delgoffe, Marshfield Clinic Research Institute
While ODS is well known for creating stylized reports (PDF, RTF, Word) or other outputs (Excel, CSV), it can also be used to create stylized emails. This paper will explore the use of ODS MSOFFICE2K as an output destination for use in email. The %SEND_MSOFFICE_EMAIL macro will be presented and used to demonstrate standardization methods, applying styles, and inclusion of tables, graphs, and text. The macro uses PROC ODSTEXT in place of a DATA step for text. It also uses PROC SQL / %LET statements to create macro variables to house counts and text for feature in the body of the email. An appendix will hold the full code and a link will be provided to a resource for download of the most recent versions, should any occur after publication.
Brief Outline:
- Introduction
- Email using FILENAME statement with EMAIL option
- Dynamic Recipients List and Text using %IF / %THEN / %ELSE
- The ODS MSOFFICE Basics
- Setting the ESCAPECHAR and Style Attributes
- Featuring Tables, Graphs, and Text
- Suggested Use: Standardization
- Suggested Use: Reporting on Scheduled Jobs
- %SEND_MSOFFICE_EMAIL full code
BB-032 : SAS® PROC GEOCODE By Example: A Case Study
Louise Hadden, Abt Global Inc.
Numerous international and domestic governments provide free public access to downloadable databases containing health data. One example is the Centers for Medicare and Medicaid Services' Care Compare data, which contains address information for providers. This paper and presentation will describe the process of downloading data and creating an analytic database; running SAS®'s PROC GEOCODE (part of Base SAS®) using Tiger street address level data to obtain latitude and longitude at a finer level than zip code; modeling the data points using SAS/Stat; and finally using PROC SGMAP (part of Base SAS®) with annotation to create a visualization of a proximity analysis. Data processing was accomplished using Base SAS® (SAS 9.4 M7). This presentation is suitable for all skill levels.
BB-033 : Looking for the Missing(ness) Piece
Louise Hadden, Abt Global Inc.
Reporting on missing and/or non-response data is of paramount importance when working with longitudinal surveillance, laboratory, and medical record data. For a CDC surveillance project with thousands of variables and weekly deliveries, an efficient and comprehensive assessment of missing values was required. PROC FREQ with the NLEVELS option, PROC REPORT and traffic-lighting, and PROC UNIVARIATE OUTTABLES can produce an effective, data-driven visualization. Data processing was performed using SAS 9.4 M7. This presentation is suitable for all skill levels.
BB-040 : An Introduction to Obtaining Test Statistics and P-Values from SAS® and R
Brian Varney, Experis
Getting values of test statistics and p-values out of SAS and R is quite easy in each software package, but the methods differ considerably between the two. This paper intends to compare the SAS and R methods for obtaining these values from tests involving Chi-Square and Analysis of Variance such that they can be leveraged in tables, listings, and figures. This paper will include but not be limited to the following topics:
- SAS ODS trace
- SAS PROC FREQ
- PROC GLM
- R stats::chisq.test() function
- R stats::aov() function
- R broom package functions
The audience for this paper is intended to be programmers familiar with SAS and R but not at an advanced level.
BB-041 : Inventory your OS for Programming Information
Brian Varney, Experis
Whether you are attempting to figure out what you have when preparing for a migration or you just want to find out which files or directories are taking up all of your space, SAS is a great tool to inventory and report on the files on your desktop or server. This paper intends to present SAS code to inventory and report on the location you want to inventory.
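The kind of file inventory described here can be sketched in Python for readers who want a quick cross-check outside SAS. This is an illustration only, not the author's SAS code; the function name `inventory` and the temporary demo files are hypothetical.

```python
import os
import tempfile


def inventory(root):
    """Walk root and return (relative_path, size_in_bytes) pairs, largest first."""
    entries = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(full_path)
            except OSError:
                continue  # skip files that vanish or cannot be read
            entries.append((os.path.relpath(full_path, root), size))
    entries.sort(key=lambda item: item[1], reverse=True)
    return entries


# Demo on a throwaway directory so the example does not depend on local files.
with tempfile.TemporaryDirectory() as demo_root:
    os.makedirs(os.path.join(demo_root, "sub"))
    with open(os.path.join(demo_root, "small.txt"), "w") as handle:
        handle.write("abc")
    with open(os.path.join(demo_root, "sub", "big.txt"), "w") as handle:
        handle.write("x" * 100)
    report = inventory(demo_root)

for path, size in report:
    print(f"{size:>10,d}  {path}")
```

Sorting by size puts the files that take up the most space at the top of the report.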
BB-042 : Bring Your SAS and Python Worlds Together With SASPy!
Ted Conway, Self
What if you could combine the powers of SAS and Python? That's the idea behind SASPy, the open-source Python package that enables Python coders to access SAS data and analytics capabilities. In this session, we'll see how SASPy can be used to integrate SAS and Python – even across platforms – bringing you the best of both worlds! SASPy can be used in a variety of SAS, Viya and Python deployments. For this session, a laptop-based Microsoft Visual Studio notebook running Python will be used together with Base SAS 9.4 via the Cloud-based SAS® OnDemand for Academics to create an interactive Sunburst chart visualization of an NBA salary SAS dataset.
BB-044 : How to build custom macros in SAS: A step-by-step approach to translating your code into a custom macro
Sara Richter, Richter Statistical Services, LLC
Do you find yourself copying/pasting code over and over and over again and thinking you should find a more efficient way? Intrigued by macros but overwhelmed at where to start? Not sure how to make the leap from copying/pasting to building a robust custom macro? Then this session is for you! In this presentation, we will discuss strategies for turning "regular code" into a custom macro, present troubleshooting tips for common macro errors, and walk step-by-step through several examples of building custom macros, from simple summary statistic macros to complex reporting macros. Attendees will be able to translate the strategies presented to their own SAS projects. This session is geared towards intermediate SAS programmers who have been exposed to macro variables and macros in SAS 9.4.
BB-045 : Loading a bunch of Excel workbooks, with each a number of sheets, into SAS
Erik Tilanus, Synchrona
I received a collection of Excel workbooks from a client, covering data of a whole year. The collection was neatly organized in folders per month and within each folder several workbooks were present. Each workbook consisted of 3 to 6 worksheets. There was a total of 50 workbooks, together containing 222 sheets. The presentation will explain how I managed to read all the workbooks and sheets, even though there was a variation of names in the sheets. This includes navigating through the monthly folders, getting the sheet names from the workbooks, and importing their content. The program is all Base SAS 9.4, using some simple macros, CALL EXECUTE, and a technique to look into the structure of an Excel workbook to obtain the sheet names. The target audience is beginner to intermediate level SAS developers.
BB-047 : Existence is Fruitful: How to determine whether something exists in your SAS session
Erin O'Dea, NORC at the University of Chicago
This paper will include descriptions and code representing several options for checking whether various things in SAS exist: macro variables, macros, tables, libnames, and columns. SAS has some built-in functions that look for macro variables and macros, as well as dsid properties and dictionary tables that can locate tables, libraries, and column names. In this paper we will explore these options and how they can be included to write more reusable code. Maybe you forgot an include statement that contains macros, or you copied over a library path incorrectly. Maybe you thought a dataset had a certain column and are surprised to find that it does not. Maybe you are unsure of what data might come in from an outside source, or your data structure changes frequently. Maybe you're a little bit confused about scope, or you didn't realize your SAS session reset. When you write a SAS program that assumes existence of certain entities, often SAS will either error out and end execution, or you will be left with a trail of warnings and errors that you need to track down to the root. If your program can detect ahead of time that something is missing, or work around it, you can save a significant amount of time on resolving that missingness. This paper is targeted to SAS users with a basic understanding of programming; it is mainly a practical description of options and code that does not require significant experience to use.
BB-050 : Building a SAS Data Viz Toolkit
Sara Richter, Richter Statistical Services, LLC
Graphs and tables can be critical tools in conveying analysis results. When they're done well, they're eye-catching and informative. When they're not done well, they can distract, bore, or even confuse your audience. Applying simple data visualization best practices to graphs and tables can turn mediocre data displays into high-impact communication tools. And, best of all, this can be done using features available in SAS/GRAPH. This presentation will provide participants with ideas for creating high-impact graphs and tables for a wide variety of analysis findings, from single numbers to group comparisons to logistic regression output. During the presentation, we will review data visualization best practices and demonstrate the effectiveness of these practices through before-and-after examples. We will highlight SAS tools that can be used alone or in combination to achieve these best practices, including formatting, the ODS ESCAPECHAR, SGANNOTATE macros, and combining PROC REPORT and SGPLOT outputs into a single visual element. These code examples will hopefully inspire analysts at all SAS levels to create more advanced data visualization elements.
BB-052 : Using SAS® to Consume Data from APIs
Jack Shoemaker, Medical Home Network
Somaiya Shakil, Medical Home Network
The world is moving to the cloud and that means using APIs to consume and publish data. At the simplest level, an API is just a web address that you could use a command-line tool like cURL to access. What comes back is often a JSON data payload. SAS® provides several tools which make access easier and more robust. Using the public CMS Beneficiary Claims Data API (BCDA), this paper will show how to use a combination of command-line tools, PROC HTTP, and PROC JSON to communicate with the CMS API. The BCDA returns data in newline-delimited JSON (NDJSON) which presents a few more wrinkles to iron out. NDJSON files contain one or more JSON payloads. PROC JSON operates on a single payload, so needs some help to consume NDJSON-formatted data. This paper will present two solutions to this challenge as well as a complete package that authenticates and consumes the CMS data.
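The NDJSON wrinkle described above can be sketched in a few lines. The following is a minimal Python illustration, not the authors' SAS solution and not either of the two solutions the paper presents. The function names and the sample records are hypothetical. It shows two generic ways to handle a file that holds one JSON payload per line: parse each line on its own, or wrap the lines into a single JSON array so that a single-payload parser can read it.

```python
import json


def parse_ndjson(text):
    """Parse newline-delimited JSON: one independent JSON document per non-blank line."""
    records = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between payloads
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError as err:
            raise ValueError(f"line {line_number}: invalid JSON ({err.msg})") from err
    return records


def wrap_as_single_payload(text):
    """Join the non-blank lines into one JSON array string for a single-payload parser."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    return "[" + ",".join(lines) + "]"


# Hypothetical sample data, not taken from the CMS API.
sample = '{"id": 1, "type": "Patient"}\n{"id": 2, "type": "Coverage"}\n'
records = parse_ndjson(sample)
combined = json.loads(wrap_as_single_payload(sample))
```

Both routes yield the same list of records. The wrapping approach is the one that maps most naturally onto a tool that expects exactly one payload per file.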
BB-054 : Operator, Please? Making the Most of SAS Language Operators
Josh Horstman, Nested Loop Consulting
While operators are some of the shortest elements in the SAS language, they are long on functionality. This paper surveys several useful SAS operators, plus a few you might not even know about, like <>, ><, and =:.
BB-067 : Best Practices for Efficiency and Code Optimization in SAS® programming
Jayanth Iyengar, Data Systems Consultants LLC
There are multiple ways to measure efficiency in SAS® programming: programmers' time, processing or execution time, memory, input/output (I/O), and storage space considerations. As data sets grow larger, efficiency techniques play a larger and larger role in the programmer's toolkit. This need has been compounded by the need to access and process data stored in the cloud, and by the pandemic, as programmers find themselves working remotely in distributed teams. As a criterion to evaluate code, efficiency has become as important as producing a clean log or expected output. This paper explores best practices in efficiency from a processing standpoint, as well as others.
BB-068 : Bessler's Recommendations for Using Color in Web Pages, Graphs, Tables, Maps, Text, and Print
LeRoy Bessler, Bessler Consulting and Research
Let me be your Color Therapist. I have been counseling users of color with SAS graphics software since 1995. I like to title this topic "Principia Color". This is a tutorial about What To, not any particular software's technical How To or coding syntax. Other authors/speakers have covered that: Perry Watts most extensively (e.g., for SAS/GRAPH), and at this conference by Richann Watson for SAS Graph Template Language. I prefer and use SAS ODS Graphics, but this presentation is software-neutral. My latest comprehensive coverage of wise use of color is in Chapter 2 of "Visual Data Insights Using SAS ODS Graphics: A Guide to Communication-Effective Data Visualization". Here at MWSUG 2024, I am available to you for a twenty-minute session at no additional charge. No SAS Software Usage Insurance coverage is required.
BB-069 : Create Report with SAS Visual Analytics to Finish Your SAS Story
Shinya Kodama, NORC
You spend long hours trying to develop SAS programs and create datasets as deliverables. How do you explain the results to your customers, peers, and stakeholders, especially when your audience is non-technical? SAS Visual Analytics is a very effective tool for creating presentations that use graphics to show the SAS data you just created and tell the story you want to highlight. This presentation will be for users of all levels.
BB-071 : The CALL EXECUTE execute statement: theory and application
Erik Tilanus, Synchrona
The standard SAS documentation on the CALL EXECUTE routine is short: "Resolves the argument, and issues the resolved value for execution at the next step boundary." In this paper we examine the working of the CALL EXECUTE routine, address some timing issues and show a few examples.
Business Leadership
BL-009 : Soft Skills to Gain a Competitive Edge in the 21st Century Job Market
Kirk Lafler, SasNerd
The 21st-century economy is converging with existing technologies like proprietary and open-source software, the Internet, robotics, and the Cloud; with emerging technologies like artificial intelligence (AI), machine learning (ML), IoT, nanotechnology, and biotechnology. And to add to all this is the collection, curation, and processing of vast quantities of data and information – all converging at breakneck speeds resulting in the blurring of boundaries between humans and machines. The 21st century job market is experiencing a technological revolution that is changing faster than ever, and organizations, along with today's workforce, must learn how to cope and adapt.
BL-014 : Predicting the Impact of Storms on Utility Customers
David Corliss, Peace-Work
In an era marked by escalating storm severity and aging utility infrastructure, the ability to predict and mitigate the impact of weather-related events on utility services is more critical than ever. DTE Energy's Weather Analytics Model is an evolving application that employs a suite of models to predict the impact of storm and weather activity on utility customers and infrastructure. This model uses National Oceanic and Atmospheric Administration (NOAA) data to identify patterns in historical weather data and make predictions about current storm outage scenarios. This presentation discusses how DTE uses this application for actionable outcomes to minimize power disruption and optimize resource allocation during storm events. The presentation also describes how this approach can be applied to the prediction of other types of business risks.
BL-031 : Automation with Custom SAS Tasks using SAS® Studio for Business Processes
Jennifer Rosson, Western Alliance Bank
SAS Studio is a web-based tool that allows users to run SAS code through the web browser. It also offers several predefined tasks, which allow for point-and-click user interfaces for several analytical procedures. A big surprise to many SAS programmers is that SAS Studio offers the capability to create custom tasks with very little effort. These custom tasks can become powerful tools of automation for many audited processes. Custom tasks can be created to provide a user interface requiring entry of a few parameters that can set off a series of SAS programs fully automated to run, review log files for errors, check data quality, send emails, and publish reports. Custom tasks to automate SAS programs can be created to meet the most stringent data control requirements, including SOX (Sarbanes-Oxley), MRMG (Model Risk Management Governance) and FRB (Federal Reserve Board) audit reviews imitating a production environment for publication of financial reports such as CECL (Current Expected Credit Losses), DFAST (Dodd-Frank Act Stress Test), and CCAR (Comprehensive Capital Analysis and Review). With the addition of some well-known Base SAS features to read directories, capture logs and send emails, a custom task can set off a fully functional and controlled production environment with less effort than most imagined.
BL-036 : SAS Job Searching and Interviewing tips – Strategies in the Post-Pandemic era
Jayanth Iyengar, Data Systems Consultants LLC
Searching for work in the data analytics market is more competitive than ever before. Part of this is due to the nature of the work environment, which has shifted from in-office, on-site positions to remote and hybrid ones. Also, because of the emergence and growth of open-source tools such as R and Python, candidates for SAS positions need to be familiar with these coding tools. In this paper, I'll outline tips and strategies for success in the application and interview process in the post-pandemic era. I'll also discuss the interview process in detail, from initial HR interviews to final interviews with a hiring manager.
BL-057 : Adventures in Independent Consulting: Perspectives from Two Veteran Consultants Living the Dream
Josh Horstman, Nested Loop Consulting
Richann Watson, DataRich Consulting
While many statisticians and programmers are content in a traditional employment setting, others yearn for the freedom and flexibility that come with being an independent consultant. In this paper, two seasoned consultants share their experiences going independent. Topics include the advantages and disadvantages of independent consulting, getting started, finding work, operating your business, and what it takes to succeed. Whether you're thinking of declaring your own independence or just interested in hearing stories from the trenches, you're sure to gain a new perspective on this exciting adventure.
BL-065 : Being A Statistical Expert Witness
David Corliss, Peace-Work
An expert in statistics can be asked to consult in several legal situations, including analysis and commentary on proposed legislation, expert analysis in legal cases, and courtroom testimony. This talk presents experiences as a statistical expert witness and the contexts where expert testimony is used, along with a brief discussion of requirements, professional background, preparation, and training on being an expert witness available from the American Statistical Association.
Hands-On Workshop
HW-003 : Machine Learning in R
Chuck Kincaid, Experis Business Analytics
Are you eager to dive into the fascinating world of machine learning and leverage the power of R to unlock its true potential? This Hands-On Workshop will equip you with key machine learning techniques and help you navigate the best R packages to tackle real-world challenges. We will use examples to build machine learning models, exploring the when, the why, and the how. The workshop will be accessible to those with all levels of machine learning expertise. Basic familiarity with R would be very beneficial, though it may not be mandatory.
HW-013 : Dashboards Made Easy Using SAS® Software
Kirk Lafler, SasNerd
Organizations around the world develop business intelligence and analytics dashboards, sometimes referred to as enterprise dashboards, to display the status of "point-in-time" metrics and key performance indicators. Effectively designed dashboards extract real-time data from multiple sources for the purpose of highlighting important information, numbers, tables, statistics, metrics, performance scorecards, and other essential content. This Hands-On Workshop (HOW) and paper provide attendees with essential rules for "good" dashboard design, the metrics frequently used in dashboards, and the use of best practice programming techniques in the design of quick and easy dashboards using SAS® software. Learn essential programming techniques to create real-world dashboards using Base SAS® software, including PROC SQL, macro, Output Delivery System (ODS), ODS HTML, ODS Excel, ODS Layout, ODS Statistical Graphics, PROC SGPLOT, PROC SGPIE, and other techniques.
HW-039 : Complex Custom Clinical Graphs Step by Step with SAS® ODS Statistical Graphics
Richann Watson, DataRich Consulting
Josh Horstman, Nested Loop Consulting
The ODS Statistical Graphics package is a powerful tool for creating the complex, highly customized graphs often produced when reporting clinical trial results. These tools include the ODS Statistical Graphics procedures, such as the SGPLOT procedure, as well as the Graph Template Language (GTL). The SG procedures give the programmer a convenient procedural interface to much of the functionality of ODS Statistical Graphics, while GTL provides unparalleled flexibility to customize nearly any graph that one can imagine. In this hands-on workshop, we step through a series of increasingly sophisticated examples demonstrating how to use these tools to build clinical graphics by starting with a basic plot and adding additional elements until the desired result is achieved.
HW-055 : Advanced Topics in ODS Graphics: Annotations, Attribute Maps, and Axis Tables
Josh Horstman, Nested Loop Consulting
The ODS Statistical Graphics package included in Base SAS provides a robust and flexible set of tools for producing professional-looking graphics. This workshop will introduce three advanced tools that extend the functionality of the statistical graphics procedures: annotations, attribute maps, and axis tables. Through a series of hands-on exercises, discover how you can take advantage of these features to create impressive and highly customized graphs.
HW-063 : An Animated Guide to the (largely) Undocumented Internals of PROC Report
Russ Lavery, Contractor
PROC Report is a legitimate big file technique. It reads the source file, from top to bottom, one time and creates an internal working file that it uses for all its calculations. Because of this structure, it is very fast and can produce a report in little more than the time it takes to read the source file. PROC Report has a fairly complicated internal structure, and the contents of that internal structure change over time. It is difficult to communicate the dynamic nature of PROC Report using words, static PowerPoint slides, or code examples in an editor. This cartoon-like treatment of the internals of PROC Report shows the different states of the system and makes this complicated topic easy to understand. When Art Carpenter's PROC Report book was being sold in hardcover, this talk was included on a CD taped to the back cover of the book. It has not been seen in public for several years. Attendees will get the SAS code for all the examples used in the HOW.
HW-064 : An Animated Guide to the Internals of GIT
Russ Lavery, Contractor
Competency in Git and GitHub is a skill set that employers are demanding. Some employers, as part of the hiring process, check the GitHub repositories of candidates. Due to time constraints, this HOW will do a deep dive into some of Git but will be light on the topic of merge conflicts and will not cover GitHub. Git has a complicated internal structure, and the content of that internal structure changes over time. This animated, cartoon-like treatment shows the changing internals of Git and allows an attendee to match the "reports" returned by Git commands (checkout, diff, diff --staged, HEAD, etc.) to the Git internals. This avoids the problem, common on YouTube, of needing to define terms using undefined terms. This HOW SHOWS what Git is doing and asks you to explain the action in your own words.
Pharma & Healthcare
PH-034 : Paper Doesn't Listen
Brooke Ellen Delgoffe, Marshfield Clinic Research Institute
Before there were electronic collection instruments, there was PAPER. While any paper survey can surely be converted to an electronic format, doing so with the data is often turbulent. Unlike our beloved REDCap® surveys that have branching logic, data quality rules, drop-downs, error-correcting pop-ups and required fields – paper doesn't have to listen to rules! This paper will shed light on methods for building tools, understanding missing data, and cleaning up data coming from paper records, drawing on a Marshfield Clinic Health System initiative to abstract information from paper records, and will provide guidance on what to expect from the process.
Brief Outline:
• Introduction
• Cause and Effect: Understanding what happens when paper doesn't listen
  o No required fields = Missingness
  o No branching logic = Erroneous entry
  o No data quality = invalid/outlier values
  o No drop-downs = unwelcomed "write-in" values
  o No field tips = inconsistent use of values/options
• Considerations for and Pitfalls of Tools collecting paper-based data
• Methods for Managing Paper-based data
  o Start Basic: Check your assumptions
  o Look Further: Distinct Values
  o Act Appropriate: Deciding between programmatic or manual rectification
PH-037 : NHANES Dietary Supplement component: a parallel programming project
Jayanth Iyengar, Data Systems Consultants LLC
The National Health and Nutrition Examination Survey (NHANES) contains many sections and components which report on and assess the nation's health status. A team of IT specialists and computer systems analysts handles data processing, quality control, and quality assurance for the survey. The most complex section of NHANES is dietary supplements, from which five publicly released data sets are derived. Because of its complexity, the Dietary Supplements section is assigned to two SAS programmers who are responsible for completing the project independently. This paper reviews the process for producing the Dietary Supplements section of NHANES, a parallel programming project conducted by the National Center for Health Statistics, a center of the Centers for Disease Control and Prevention (CDC).
PH-038 : Harnessing the Power of PROC CONTENTS and SELECT INTO: SEPARATED BY
Yanfang Pang, AbbVie
Statistical programmers working on clinical trials often need to create SDTM (Study Data Tabulation Model) and ADaM (Analysis Dataset Model) datasets. These datasets are critical for analyzing and interpreting the trial results. In some cases, the programmers derive these datasets from existing ones by extracting, transforming, or modifying attributes of the data. However, doing this manually can be time-consuming. PROC CONTENTS provides metadata about datasets, such as variable names, labels, and data types. The SQL SELECT INTO: SEPARATED BY statement can store the names of variables of interest in a macro variable. By combining these two techniques, programmers can efficiently extract and transform data as well as generate new datasets. This paper shows how to use this approach to create the EX (Exposure) dataset from the EC (Exposure as Collected) dataset by replacing "EC" with "EX" in the variable names and labels. It also demonstrates creating the ADDS (Disposition Analysis Dataset) dataset from the DS (Disposition) and SUPPDS (DS Supplemental Qualifiers) datasets. Benefits highlighted in the paper include avoiding the need to list all related variables, increased speed, and reduced errors.