Presentations by users are the heart of a SAS users group meeting. MWSUG 2025 will feature a variety of papers and presentations organized into several academic sections covering many different topics and experience levels.

Note: Content and schedule are subject to change. Last updated 25-Aug-2025.



AI and Emerging Technology

Paper No. Author(s) Paper Title
AE-001 David Corliss Designing Against Bias: Identifying and Mitigating Bias in Machine Learning and AI


Analysis & Advanced Analytics

Paper No. Author(s) Paper Title
AL-015 Jayanth Iyengar Conducting Survival Analysis in SAS using Medicare Claims as a Real-world data source
AL-025 Bruce Lund Logistic Modeling without Split-Samples
AL-031 Brandy Sinco et al. Logistic Regression Odyssey on a Small Sample Due to Rare Cancer
AL-037 Troy Hughes GIS Challenges of Cataloging Catastrophes: Serving up GeoWaffles with a Side of Hash Tables to Conquer Big Data Point-in-Polygon Determination and Supplant SAS PROC GINSIDE
AL-052 Ross Bettinger Feature Selection and Classification Using Fuzzy Logic
AL-056 Danny Modlin PROC BGLIMM: The Smooth Transition to Bayesian Analysis
AL-057 Danny Modlin How to Modify SAS9 Statistics Programs to Run in SAS Viya
AL-058 Danny Modlin Large-Scale Time Series Forecasting in Model Studio
AL-059 Hangcen Zou Kernel Bandwidth Selection for Maximum Mean Discrepancy
AL-079 Ryan Paul Lafler Modern Deep Learning Architectures and Transfer Learning for Tabular, Unstructured, Sequential, and Time-Dependent Data


Banking and Finance

Paper No. Author(s) Paper Title
BF-051 David Corliss Analysis of Economic Turbulence and Disruptive Events in Finance
BF-083 Rex Pruitt Follow the Blueprint for Success when Migrating Banking and Finance Customers to SAS Viya


Beyond the Basics

Paper No. Author(s) Paper Title
BB-002 Kirk Paul Lafler SAS Macro Programming: The Basics and Beyond
BB-003 Kirk Paul Lafler Under the Hood: The Mechanics of SQL Query Optimization Techniques
BB-008 Kirk Paul Lafler Creating Custom Excel Spreadsheets with Built-in Autofilters Using PROC REPORT and ODS EXCEL
BB-009 Louise Hadden You've Got Options: Ten Five-Star System Option Hacks
BB-016 Jayanth Iyengar Validate the Code, not just the Data: A System for SAS program evaluation
BB-023 Jayanth Iyengar From %let To %local; Methods, Use, And Scope Of Macro Variables In SAS Programming
BB-024 Kim Wilson Taking the Mystery Out of and Debugging PROC HTTP
BB-029 Stephen Sloan Efficiency Techniques in SAS 9
BB-038 Troy Hughes Undo SAS Fetters with Getters and Setters: Supplanting Macro Variables with More Flexible, Robust PROC FCMP User-Defined Functions That Perform In-Memory Lookup and Initialization Operations
BB-039 Troy Hughes SAS Data-Driven Software Design: How to Develop More Modular, Maintainable, Fixable, Flexible, Configurable, Compatible, Reusable, Readable Software through Independent Control Tables and Other Control Data
BB-040 Ted Conway Fun With SAS and Emoji: What Might a Rebus-Influenced Programming Language Look Like?
BB-042 Vijayasarathy Govindarajan Going from PROC SQL to PROC FedSQL for CAS Processing: Common mistakes to avoid
BB-043 Vijayasarathy Govindarajan Accelerating Your SAS Data Step: Tips and Best Practices for SAS Viya Migration
BB-044 Vijayasarathy Govindarajan SAS Macros and PROC FCMP: A Comparative Inquiry into Reusability and Logic Design
BB-049 LeRoy Bessler Use ODS Excel, ODS PDF, ODS HTML5, ODS LAYOUT
BB-050 David Corliss Cutting Edge Regression Methods: Ridge, LASSO, LOESS, and GAM
BB-063 Jim Box What is Machine Learning, Anyway
BB-065 Jim Box Enhance your Coding Experience with the SAS Extension for VS Code
BB-066 Jim Box How did that Python code get in my SAS program?
BB-067 Josh Horstman & Richann Watson From Muggles to Macros: Transfiguring Your SAS Programs with Dynamic, Data-Driven Wizardry
BB-068 Josh Horstman & Richann Watson More Muggles, More Macros: Adding Advanced Data-Driven Wizardry to Your SAS Programs
BB-070 Josh Horstman Fifteen Functions to Supercharge Your SAS Code
BB-073 Brian Knepple Enhancing Your PROC REPORT Output: Top Tips
BB-074 Shavonne Standifer Leveraging SQL and SAS for Analysis-Ready Datasets
BB-075 Shavonne Standifer Automate in a Dash with SAS Time-Saving Techniques for Building Quality Improvement Dashboards


Pharma and Life Sciences

Paper No. Author(s) Paper Title
PL-014 Jayanth Iyengar Applications of PROC COMPARE to Parallel Programming and other projects
PL-017 Richann Watson Have a Date with ISO? Using PROC FCMP to Convert Dates to ISO 8601
PL-018 Richann Watson Worried about that Second Date with ISO? Using PROC FCMP to Convert and Impute ISO 8601 Dates to Numeric Dates
PL-022 Crisa Chen & Jiangang Cai Mind the Gaps: Automating Multiple Imputation in Clinical Trial Workflows
PL-032 Troy Hughes Last Observation Carried Forward (LOCF) in Longitudinal Clinical Studies: Adopting a Functional Approach to Imputing Missing Values Using PROC FCMP, the SAS Function Compiler
PL-033 Troy Hughes Geocoding with the Google Maps API: Using PROC FCMP To Call User-Defined SAS and Python Functions That Geocode Coordinates into Addresses, Calculate Routes, and More!
PL-046 LeRoy Bessler COVID-19 Explored Using SAS and ODS Graphics: InfoGeographic and Data Graphic Analysis & Pictures
PL-069 Josh Horstman & Richann Watson Jazz Up Your Profile: Perfect Patient Profiles in SAS using ODS Statistical Graphics
PL-081 Kelly Olano et al. The Rare Disease Clinical Research Network (RDCRN) works to advance medical research on rare diseases by providing support for clinical studies and facilitating collaboration, study enrollment and data sharing.


Posters

Paper No. Author(s) Paper Title
PO-011 Louise Hadden ExCITE-ing! Build Your Paper's Reference Section Programmatically Using Lex Jansen's Website and SAS
PO-012 Louise Hadden The World is Not Enough: Base SAS Visualizations and Geolocations
PO-036 Troy Hughes Who's Bringing That Big Data Energy? A 48-Year Longitudinal Analysis of 30,000 Presentations in the SAS User Community To Elucidate Top Contributors and Rising Stars
PO-060 Jimin Lee Market Making Control Problem with Inventory Risk


Visualization and Reporting

Paper No. Author(s) Paper Title
VR-005 Kirk Paul Lafler Dashboards Made Easy Using SAS Software
VR-010 Louise Hadden The (ODS) Output of Your Desires: Creating Designer Reports and Data Sets
VR-034 Troy Hughes From Word Clouds to Phrase Clouds to Amaze Clouds: A Data-Driven Python Programming Solution To Building Configurable Taxonomies That Standardize, Categorize, and Visualize Phrase Frequency
VR-045 Melinda Macdougall Maps, maps, and more maps using SAS PROC SGMAP!
VR-048 LeRoy Bessler Wise Graphic Design & Color Use for Data Graphics Easily, Quickly, Correctly Understood




Abstracts

AI and Emerging Technology

AE-001 : Designing Against Bias: Identifying and Mitigating Bias in Machine Learning and AI
David Corliss, Peace-Work

Bias in machine learning algorithms is one of the most important ethical and operational issues in statistical practice today. This paper describes common sources of bias and how to develop study designs to measure and minimize it. Analysis of disparate impact is used to quantify bias in existing and new applications. New open-source packages such as Fairlearn and the AI Fairness 360 Toolkit quantify bias by automating the measurement of disparate impact on marginalized groups, offering great promise to advance the mitigation of bias. These design strategies are described in detail with examples. Also, a comparison algorithm can be developed that is designed to be fully transparent and without features subject to bias. Comparison to this bias-minimized model can identify areas of bias in other algorithms.


Analysis & Advanced Analytics

AL-015 : Conducting Survival Analysis in SAS using Medicare Claims as a Real-world data source
Jayanth Iyengar, Data Systems Consultants LLC

Applications of survival analysis as a statistical technique extend to longitudinal studies and other studies in health research. The SAS/STAT package contains multiple procedures for performing survival analysis, the most well-known being PROC LIFETEST and PROC PHREG. As a data source, Medicare claims are often used in real-world evidence studies and observational research. In this paper, survival analysis and the SAS procedures for performing it will be explored, and survival analyses will be conducted using Medicare claims data sets to assess patients' prognosis among Medicare beneficiaries.
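The two procedures named above can be sketched on a hypothetical claims-derived analysis file (the data set and variable names here are illustrative, not taken from the paper):

```sas
/* Kaplan-Meier estimates and log-rank test, stratified by a group variable */
proc lifetest data=medicare_analytic plots=survival(atrisk);
   time follow_up_months * death(0);   /* death=0 flags censored records */
   strata plan_type;
run;

/* Cox proportional hazards model adjusting for covariates */
proc phreg data=medicare_analytic;
   class plan_type (ref='FFS') / param=ref;
   model follow_up_months * death(0) = age plan_type comorbidity_count;
run;
```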


AL-025 : Logistic Modeling without Split-Samples
Bruce Lund, Statistical Trainer

Here is a quote from N. Kriegeskorte et al. (2009), "Circular analysis in systems neuroscience: the dangers of double dipping," Nature Neuroscience: "Double dipping is the use of the same dataset for selection and selective analysis. It gives distorted descriptive statistics and invalid statistical inference." Double dipping would arise if a logistic model were fit to an Analysis dataset and the same Analysis dataset were used for computing model validation statistics (e.g., c-statistic, average squared error). The long-standing approach to avoiding double dipping is split-sampling, in which the Analysis dataset is randomly divided into Training and Validation datasets. The split is often 50%-50% but could be 60%-40% or 70%-30%. The focus of this paper is on binary logistic modeling. In split-sampling, a logistic model is fit on Training without ever looking at the Validation dataset. Once a final model is fitted, model performance is measured on the Validation dataset. Can the problem of double dipping be avoided without a split-sample? If so, this would have the advantage of fitting the model on the entire Analysis dataset, giving better predictor variable selection and better coefficient estimation. But this leaves open the question of how to perform model validation. It is a purpose of the paper to show how split-sampling can be avoided. Briefly, the approach uses bootstrap sampling to find an "optimism correction": an adjustment to performance metrics (e.g., c-statistic, average squared error) that are computed on the full Analysis dataset. That is, the model is fitted on the Analysis dataset, the performance metrics are also computed on the Analysis dataset, and these metrics are then corrected by the optimism correction. The paper explains how bootstrap sampling is utilized in finding the optimism correction.
Optimism correction is presented in Efron and Tibshirani (1993), An Introduction to the Bootstrap, pp. 247-252. In recent years, F. Harrell and E. Steyerberg have championed this approach (see references in the paper). This paper uses SAS 9.4 Base and SAS/STAT. The intended audience has intermediate SAS programming skill and familiarity with logistic regression.
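The bootstrap optimism-correction workflow described above can be sketched as follows (the data set, predictors, seed, and B=200 replicates are all illustrative):

```sas
/* Step 1: fit on the full Analysis dataset; capture the apparent c-statistic */
proc logistic data=analysis;
   model y(event='1') = x1 x2 x3;
   ods output Association=assoc_apparent;   /* table containing the c-statistic */
run;

/* Step 2: draw B=200 bootstrap samples, with replacement, same size as analysis */
proc surveyselect data=analysis out=boot seed=20259
                  method=urs samprate=1 outhits reps=200;
run;

/* Step 3 (per replicate, e.g., BY Replicate): refit the model on the bootstrap
   sample and score BOTH that sample and the original data;
   optimism_b = c(bootstrap sample) - c(original data).                         */
/* Step 4: corrected c = apparent c - mean(optimism_b) over the 200 replicates. */
```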


AL-031 : Logistic Regression Odyssey on a Small Sample Due to Rare Cancer
Brandy Sinco, University of Michigan
Jessie Dalman, University of Michigan
Tasha Hughes, University of Michigan

SAS Products: SAS/STAT 9.4, Procs Power, Freq, Logistic, GenMod, BGLIMM Skill Level: Statistician, bachelor's degree and above Intended Audience: Statisticians, anyone involved in data analysis Background: Due to Leiomyosarcoma (LMS) being a rare cancer, a recurrence study had a small N. Based on prior medical experience, surgical oncologists hypothesized that recurrence would be higher among patients with sub-cutaneous LMS subtype than with cutaneous LMS. Because of a dataset with a small N and a prior hypothesis, we expected Bayesian logistic regression to be key to finding a credible interval for recurrence. Methods: First, power was calculated by using SAS Proc Power. As the biostatistician suspected, the power to detect a 10% difference in recurrence between patients with cutaneous and sub-cutaneous LMS was <80%. The analysis began by looking at individual predictors of recurrence via logistic regression and then by using a multi-variable logistic regression. For variables with small cell sizes by recurrence, the Firth adjustment for computing maximum likelihood estimates was used. Before finalizing the multi-variable logistic regression model, multi-collinearity was evaluated with the variance inflation factor. Next, the results of the classical logistic regression model were confirmed with Bayesian logistic regression. We turned to Bayesian logistic regression because the Bayesian algorithm does not rely on asymptotic statistics from a large sample. The numbers of burn-in and Monte Carlo repetitions were selected to generate Geweke diagnostics to indicate similar means in the beginning and end of the Markov chain and for the proportion of variance due to Monte Carlo simulation to be 2.5%. The thinning parameter was chosen to obtain an auto-correlation time of <3 lags. These diagnostics were to be confirmed with stable trace, auto-correlation, and posterior density plots. Results: The original dataset contained N = 116 LMS patients.
Seventy-two patients had cutaneous LMS and 44 patients had sub-cutaneous LMS. In this initial dataset, 3 (4.2%) of the cutaneous patients experienced recurrence, compared to 9 (7.8%) of sub-cutaneous patients, corresponding to p = .081 with the Fisher exact test. From logistic regression, patients with sub-cutaneous extension had an odds ratio (OR) = 3.63 (0.86, 15.35); p = .080 and an adjusted odds ratio (AOR) = 4.49 (0.89, 22.74); p = 0.070 for recurrence, compared to patients with cutaneous tumor only. In the Bayesian analysis, we reported both of the 95% credible intervals, using the equal tailed and highest posterior density methods. Both credible intervals indicated over 95% probability of tumor sub-type being a key predictor of tumor recurrence. Tumor sub-type was the only predictor that produced credible intervals for the odds ratios with all sides above 1. The equal-tailed 95% credible interval was 5.48 (1.10, 29.74) and the highest posterior density (hpd) credible interval was 5.48 (1.06, 28.20). As the analysis progressed, a surgical oncologist determined that one patient needed to be excluded and another patient had been mis-classified with cutaneous LMS. This final sample had N = 115. While the analyst expected re-running the analyses to be simple and straightforward, there were some surprises. First, the comparisons of recurrence became statistically significant at p<.05. The recurrence rates became 1 (1.4%) among cutaneous patients, compared to 7 (15.6%) among sub-cutaneous patients, p = 0.006 with the Fisher exact test. Second, classical logistic regression produced a p-value < .05, but with a wide confidence interval. Third, Bayesian logistic regression generated 95% credible intervals that contained odds ratios > 1, although the intervals were much wider than before.
Conclusion: Both classical logistic regression and Bayesian logistic regression indicated that patients with the sub-cutaneous subtype had higher odds of recurrence, compared to patients with the cutaneous sub-type. In situations where the sample size is small due to a rare disease, both the Firth option in classical logistic regression and Bayesian logistic regression are useful tools to confirm that a variable is an important predictor.


AL-037 : GIS Challenges of Cataloging Catastrophes: Serving up GeoWaffles with a Side of Hash Tables to Conquer Big Data Point-in-Polygon Determination and Supplant SAS PROC GINSIDE
Troy Hughes, Data Llama Analytics

The GINSIDE procedure represents the SAS solution for point-in-polygon determination: given some point on Earth, does it fall inside or outside of one or more bounded regions? Natural disasters typify geospatial data: the coordinates of a lightning strike, the epicenter of an earthquake, or the jagged boundary of an encroaching wildfire. Yet observing nature seldom yields more than latitude and longitude coordinates. Thus, when the United States Forestry Service needs to determine in what zip code a fire is burning, or when the United States Geological Survey (USGS) must ascertain the state, county, and city in which an earthquake was centered, a point-in-polygon analysis is inherently required. It determines within what boundaries (e.g., nation, state, county, federal park, tribal lands) the event occurred, and confers boundary attributes (e.g., boundary name, area, population) to that event. Geographic information systems (GIS) that process raw geospatial data can struggle with this time-consuming yet necessary analytic endeavor: the attribution of points to regions. This text demonstrates the tremendous inefficiency of the GINSIDE procedure, and promotes GeoWaffles as a far faster alternative that comprises a mesh of rectangles draped over polygon boundaries. This facilitates memoization by running point-in-polygon analysis only once, after which the results are saved to a hash object for later reuse. GeoWaffles debuted in the 2013 white paper Winning the War on Terror with Waffles: Maximizing GINSIDE Efficiency for Blue Force Tracking Big Data (Hughes, 2013), and this text represents an in-memory, hash-based refactoring. All examples showcase USGS tremor data as GeoWaffles tastefully blow GINSIDE off the breakfast buffet, processing coordinates more than 25 times faster than the out-of-the-box SAS solution!
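The memoization idea can be illustrated with a DATA step hash object; the grid resolution, data set names, and the `resolve_region` lookup function below are hypothetical placeholders, not the paper's implementation:

```sas
data tagged;
   if _n_ = 1 then do;
      declare hash memo();                /* cache of already-resolved cells  */
      memo.defineKey('cellx', 'celly');
      memo.defineData('region');
      memo.defineDone();
   end;
   set tremors;                           /* point events: latitude, longitude */
   length region $ 32;
   cellx = floor(longitude * 10);         /* snap to a 0.1-degree "waffle"    */
   celly = floor(latitude * 10);
   if memo.find() ne 0 then do;           /* cache miss: run the expensive    */
      region = resolve_region(longitude, latitude); /* hypothetical FCMP call */
      rc = memo.add();                    /* remember the answer for the cell */
   end;
run;
```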


AL-052 : Feature Selection and Classification Using Fuzzy Logic
Ross Bettinger, Consultant

We investigate the use of fuzzy logic as applied to feature selection and classification. Fuzzy logic, a generalization of classical set theory, can be useful in situations where there is imprecision or vagueness in the problem domain. Fuzzy logic is applied to transform input data into fuzzy sets that are then suitable for processing by a feature selection algorithm. A fuzzy entropy measure is used to perform classification using a similarity classifier.


AL-056 : PROC BGLIMM: The Smooth Transition to Bayesian Analysis
Danny Modlin, SAS

Many analysts are interested in taking models they currently have and transitioning them to the Bayesian realm. Most leap from their favorite classical analysis procedure directly to PROC MCMC, the general-purpose Bayesian procedure. This presentation will feature the BGLIMM procedure available since SAS/STAT 15.1. This will allow the participant to model non-normal responses and include random effects within their Bayesian approach. Discussion will include options of priors and availability of statements. Examples will include models originally written in PROCs REG, GLM, GLMSELECT, GENMOD, MIXED, and GLIMMIX.
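The "smooth transition" can be sketched side by side; the data set, variables, and MCMC settings below are illustrative, not from the presentation:

```sas
/* Frequentist starting point: a binary mixed model in GLIMMIX */
proc glimmix data=trial;
   class trt site;
   model response(event='1') = trt baseline / dist=binary link=logit;
   random intercept / subject=site;
run;

/* Bayesian counterpart: essentially the same model syntax, plus MCMC controls */
proc bglimm data=trial seed=52577 nmc=10000 nbi=2000;
   class trt site;
   model response(event='1') = trt baseline / dist=binary link=logit;
   random intercept / subject=site;
run;
```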


AL-057 : How to Modify SAS9 Statistics Programs to Run in SAS Viya
Danny Modlin, SAS

Learn how existing SAS 9 programs can be modified to execute in SAS Viya. Code can either run as is on the SAS Compute Server, or it can be modernized to process data in memory and in parallel on the SAS Cloud Analytic Services (CAS) server. This presentation is perfect for programmers who are new to SAS Viya and want to continue performing their statistical analyses there. We will address questions that are typically asked: Will existing SAS 9 code work in Viya? How must my programs change to take advantage of the new features in Viya?
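The "run as is" versus "modernize for CAS" distinction can be sketched as follows (session, caslib, and table names are illustrative):

```sas
cas mysess;                       /* start a CAS session                     */
caslib _all_ assign;              /* expose caslibs as SAS librefs           */

/* Runs in CAS because both input and output tables live in a caslib */
data casuser.cars_scored;
   set casuser.cars;
   log_msrp = log(msrp);
run;

/* The same DATA step reading/writing a Base library would still run,
   but on the SAS Compute Server rather than in CAS */
cas mysess terminate;
```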


AL-058 : Large-Scale Time Series Forecasting in Model Studio
Danny Modlin, SAS

In this workshop, you learn to build time series models for large-scale time series problems with many hierarchically related series. You will experience the capability of Model Studio to diagnose, fit, and assess models for many time series at once. Use the new Hierarchical Modeling Node to create time series models at each of the levels of the hierarchy. Need to extract your reconciled predictions from each level of the hierarchy? No problem. Within the Hierarchical Modeling Node, you can dive into each level of the hierarchy and export these desired predictions.


AL-059 : Kernel Bandwidth Selection for Maximum Mean Discrepancy
Hangcen Zou, Washington University in St. Louis

Distributional shifts between training and testing data can severely affect the performance of machine learning models, making their detection a critical task. The kernel two-sample test based on maximum mean discrepancy (MMD) is a widely adopted approach for this purpose. However, its effectiveness depends heavily on the choice of kernel bandwidth. This project investigates the influence of bandwidth on MMD performance in streaming data contexts and offers practical guidance for efficient bandwidth selection. Through extensive simulations, we assess the robustness and sensitivity of both the standard MMD and a Mahalanobis-aggregated variant under a variety of data-generating conditions. Multiple bandwidth selection strategies are evaluated and compared to inform best practices in real-time detection tasks.
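For reference, the squared population MMD between distributions $P$ and $Q$ under a kernel $k$ is the standard quantity

```latex
\mathrm{MMD}^2(P,Q) \;=\; \mathbb{E}_{x,x' \sim P}\,k(x,x')
\;-\; 2\,\mathbb{E}_{x \sim P,\,y \sim Q}\,k(x,y)
\;+\; \mathbb{E}_{y,y' \sim Q}\,k(y,y'),
```

where, for the Gaussian kernel $k(x,y) = \exp\!\left(-\lVert x-y\rVert^2 / (2\sigma^2)\right)$, the bandwidth $\sigma$ is the tuning parameter under study; a common baseline is the median heuristic, which sets $\sigma$ to the median pairwise distance among the pooled samples.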


AL-079 : Modern Deep Learning Architectures and Transfer Learning for Tabular, Unstructured, Sequential, and Time-Dependent Data
Ryan Paul Lafler, Premier Analytics Consulting, LLC

Deep learning (DL) offers powerful architectures for extracting patterns and representations from diverse data types, including tabular datasets, images, text, audio, and time-series. This presentation provides a high-level survey of modern deep learning architectures developed with Python's Keras API for TensorFlow, including Artificial Neural Networks (ANNs) for structured data, Convolutional Neural Networks (CNNs) for image and spatial data, and sequence modeling architectures such as recurrent neural networks (RNNs), gated recurrent units (GRUs), long short-term memory networks (LSTMs), and transformer-based models. The focus is on how these architectures address different problem domains and data modalities, with minimal coverage of foundational theory. A key part of this discussion highlights transfer learning as a strategy for leveraging pre-trained models and fine-tuning them for new tasks, enabling faster development and improved performance across applications. By combining the right architecture with transfer learning techniques, AI engineers can accelerate solutions for classification, regression, forecasting, and generative tasks.


Banking and Finance

BF-051 : Analysis of Economic Turbulence and Disruptive Events in Finance
David Corliss, Peace-Work

2025 has been a year of considerable economic disruption, resulting in widespread financial uncertainty. This paper presents methods for analyzing these events. Drawing on the mathematics of Chaos Theory, the paper describes the SAS procedures and options suited to predicting the results of rare, extreme events. Methods include Unobserved Components models, logistic maps, and data visualizations to describe, understand, and predict, to the degree possible, the outcomes of disruptive events in economics and finance.


BF-083 : Follow the Blueprint for Success when Migrating Banking and Finance Customers to SAS Viya
Rex Pruitt, SAS

Migrating financial services organizations from legacy SAS 9.4 environments or open-source platforms like Python/R to SAS Viya requires a strategic, well-orchestrated approach. This session presents a proven blueprint for success, highlighting key considerations, tools, and best practices to ensure a smooth transition. Attendees will learn how to assess and migrate SAS Enterprise Guide projects into SAS Studio Flows, adapt open-source scripts for Viya compatibility, and leverage SAS tools such as Content Assessment and Enterprise Session Monitor to streamline migration efforts. The presentation outlines a step-by-step checklist for both SAS and open-source migrations, emphasizing environment preparation, code adaptation, validation, and operationalization. Additionally, it explores common pitfalls, such as misaligned usage education and inadequate partner support, and how to avoid them. With a focus on enabling full-scope capability adoption, this session equips pre-sales and technical teams with the insights needed to deliver value quickly and confidently in Viya. Whether you're migrating EG projects or SageMaker pipelines, this blueprint ensures your path to success is clear, efficient, and scalable.


Beyond the Basics

BB-002 : SAS Macro Programming: The Basics and Beyond
Kirk Paul Lafler, SasNerd

The SAS Macro Language is a powerful feature for extending the capabilities of the SAS System. This paper highlights a collection of techniques for constructing reusable and effective macro tools. Attendees are introduced to the techniques associated with building functional macros that process statements containing SAS code; design reusable macro techniques; create macros containing keyword and positional parameters; utilize defensive programming tactics and techniques; build a library of macro utilities; interface the macro language with the SQL procedure; and develop efficient and portable macro language code.


BB-003 : Under the Hood: The Mechanics of SQL Query Optimization Techniques
Kirk Paul Lafler, SasNerd

The SAS software and SQL procedure provide powerful features and options for users to gain a better understanding of what's taking place during query processing. This presentation explores the fully supported SAS MSGLEVEL=I system option and PROC SQL _METHOD option to display valuable informational messages on the SAS log about the SQL optimizer's execution plan as it relates to processing SQL queries, along with an assortment of query optimization techniques.
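The two diagnostics named above take one line each; the query shown is illustrative:

```sas
options msglevel=i;       /* surfaces index-usage and merge notes in the log */

proc sql _method;         /* prints the optimizer's execution plan codes
                             (e.g., sqxslct, sqxjm, sqxjndx) to the log     */
   select c.id, o.amount
      from customers c, orders o
      where c.id = o.id;
quit;
```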


BB-008 : Creating Custom Excel Spreadsheets with Built-in Autofilters Using PROC REPORT and ODS EXCEL
Kirk Paul Lafler, SasNerd

Spreadsheets have become the most popular and successful data tool ever conceived. Current estimates show that there are more than 750 million Excel users worldwide. A spreadsheet's simplicity and ease of use are two reasons for the growth and widespread use of Excel around the globe. Additional value-added features have also helped to expand the spreadsheet's usefulness among a growing number of users, including collaborative capabilities, customizability, data manipulation, data visualization techniques, mobile device usage, automation of repetitive tasks, integration with other software, data analysis, and filtering capabilities using autofilters. This last value-added feature, filtering with autofilters, is the theme for this paper. An example application will be illustrated that creates a custom Excel spreadsheet with built-in autofilters, or filters that provide users with the ability to make choices from a list of text, numeric, or date values to find data of interest quickly, using the SAS Output Delivery System (ODS) Excel destination and the REPORT procedure.
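A minimal sketch of the paper's theme, using a shipped sample data set (the file path and column range are illustrative):

```sas
ods excel file="/tmp/cars_report.xlsx"
          options(sheet_name="Cars" autofilter="1-4");  /* filter columns 1-4 */
proc report data=sashelp.cars(obs=100) nowd;
   columns make model type msrp;
   define make / order;
   define msrp / format=dollar10.;
run;
ods excel close;
```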


BB-009 : You've Got Options: Ten Five-Star System Option Hacks
Louise Hadden, Cormac Corporation

SAS provides myriad opportunities for customizing programs and processes, including a wide variety of system options that can control and enhance SAS code from start to finish. This paper and presentation demonstrates methods of obtaining information on SAS system options and moves on to fully explicate ten SAS system option hacks, from COMPRESS to VALIDVARNAME. System options are highly dependent on platforms, security concerns, SAS versions and products: dependencies and defaults will be discussed. SAS practitioners will gain a deeper understanding of the powerful SAS system options they have seen, used, and automatically included in their code. This presentation is suitable for all skill and experience levels; platform and implementation differences are part of the discussion.
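Two of the interrogation techniques the paper covers, sketched here with example options (these are illustrations, not the paper's full list of ten hacks):

```sas
/* Inspect a single option in detail, or browse a whole option group */
proc options option=compress define value; run;
proc options group=memory; run;

/* Set two of the options discussed, COMPRESS and VALIDVARNAME */
options compress=binary validvarname=v7;
```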


BB-016 : Validate the Code, not just the Data : A System for SAS program evaluation
Jayanth Iyengar, Data Systems Consultants LLC

Regardless of the industry they work in, SAS programmers are focused on validating data, and devote a considerable amount of attention to the quality of data, whether it's raw source data, submitted SAS data sets, or SAS output, including figures and listings. No less important is the validity of the code and the SAS programs which extract, manipulate, and analyze data. Although code validity can be assessed through the SAS log, there are other ways to produce metrics on code validity. This paper introduces a system for SAS program validation which produces useful information on lines of code, number of DATA steps, total run and CPU time, and other metrics for project-related SAS programs.


BB-023 : From %let To %local; Methods, Use, And Scope Of Macro Variables In SAS Programming
Jayanth Iyengar, Data Systems Consultants LLC

Macro variables are one of the powerful capabilities of the SAS system. Utilizing them makes your SAS code more dynamic. There are multiple ways to define and reference macro variables in your SAS code, from %LET and CALL SYMPUT to PROC SQL INTO. There are also several kinds of macro variables, distinguished by scope and other characteristics. Not every SAS programmer is knowledgeable about the nuances of macro variables. In this paper, I explore the methods for defining and using macro variables. I also discuss the nuances of macro variable scope, and the kinds of macro variables, from user-defined to automatic.
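The three creation methods and the scoping behavior contrasted above can be sketched as (names and values are illustrative):

```sas
%let cutoff = 2024;                        /* 1. %LET: global when used here   */

data _null_;
   call symputx('nobs', 42, 'G');          /* 2. CALL SYMPUTX in a DATA step   */
run;

proc sql noprint;
   select count(*) into :ncars trimmed     /* 3. PROC SQL INTO                 */
      from sashelp.cars;
quit;

%macro demo;
   %local cutoff;                          /* %LOCAL confines scope to the     */
   %let cutoff = 1999;                     /* macro; the global value above    */
%mend demo;                                /* is untouched                     */
```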


BB-024 : Taking the Mystery Out of and Debugging PROC HTTP
Kim Wilson, SAS

Several great papers have been written about how to get started with PROC HTTP, which includes accessing Microsoft 365 applications, modifying various options for desired results, and more. As a SAS Technical Support Engineer, I often assist SAS customers who are not receiving the expected resource, or they are seeing a return code that is not a 200 OK. This paper describes common errors that you might encounter regarding certificates, authentication, and general errors, as well as overall debugging techniques and suggestions. This paper also helps you gather pertinent information that SAS Technical Support will need when helping to solve the problems occurring with or around PROC HTTP. This is applicable for SAS 9.4 and above.
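A minimal debugging sketch of the kind the paper builds on; the URL is illustrative, and the DEBUG statement and automatic status macro variables assume SAS 9.4M5 or later:

```sas
filename resp temp;
proc http url="https://httpbin.org/get"
          method="GET"
          out=resp;
   debug level=1;                 /* write request/response headers to the log */
run;

/* The automatic macro variables carry the HTTP status of the last call */
%put NOTE: &=SYS_PROCHTTP_STATUS_CODE &=SYS_PROCHTTP_STATUS_PHRASE;
```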


BB-029 : Efficiency Techniques in SAS 9
Stephen Sloan, Dawson D R

Using space and time efficiently has always been important to organizations and to programmers in general, and to SAS programmers in particular. We want to be able to use our available space without having to obtain new servers or other hardware resources, and without having to delete variables or observations to make the SAS data sets fit into the available space. We also want our jobs to run more quickly, both to reduce waiting times and to ensure that scheduled job streams finish on time and that successor jobs are not unnecessarily delayed. Internal mainframe billing algorithms have always rewarded efficiency. As we move toward cloud computing, efficiency will become even more important because the billing algorithms in cloud environments charge for every byte and every CPU second, putting an additional financial premium on efficiency. Sometimes we are in a hurry to get our jobs done on time, so we don't initially pay attention to efficiency; sometimes we don't know at the start of a project how much time and space our jobs will use (and the important time is the time allocated to our assignment); and sometimes we're asked to go into existing jobs and make changes that are seemingly incremental but can cause large increases in the amount of space and/or time that is required. Finally, there can be jobs that have been running for a long time and we take the attitude "if it ain't broke, don't fix it" because we don't want to cause the programs to stop working, especially if they're not well-documented. With a reasonably good knowledge of Base SAS, there are things we can do that can help our organizations optimize the use of space and time and run more quickly without causing any loss of observations or variables and without changing the results of the programs.
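Two representative space and time techniques of the kind the paper covers (library, data set, and variable names are illustrative):

```sas
options compress=yes fullstimer;   /* compress output data sets; log timings  */

data work.subset;
   set big.master(keep=id amount status    /* read only the needed variables  */
                  where=(status = 'A'));   /* subset while reading, not after */
run;
```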


BB-038 : Undo SAS Fetters with Getters and Setters: Supplanting Macro Variables with More Flexible, Robust PROC FCMP User-Defined Functions That Perform In-Memory Lookup and Initialization Operations
Troy Hughes, Data Llama Analytics

Getters and setters are common in some object-oriented programming (OOP) languages such as C++ and Java, where "getter" functions retrieve values and "setter" functions initialize (or modify) variables. In Java, for example, getters and setters are constructed as conduits to private class fields, and facilitate data encapsulation by restricting variable access. Conversely, the SAS language lacks classes, so SAS global macro variables are typically used to maintain and access data across multiple DATA steps and procedures. Unlike an OOP program that can categorize variables across multiple user-defined classes, however, SAS maintains only one global symbol table in which global macro variables can be maintained. Additionally, maintaining and accessing macro variables can be difficult when quotation marks, ampersands, percentage signs, and other special characters exist in the data. This text introduces user-defined getter functions and setter subroutines designed using the FCMP procedure, which enable data lookup and initialization operations to be performed within DATA steps. Among other benefits, user-defined getters and setters can facilitate the evaluation of complex Boolean logic expressions that leverage data stored across multiple data sets, all concisely performed in a single SAS statement! Getters and setters are thoroughly demonstrated in the author's text: PROC FCMP User-Defined Functions: An Introduction to the SAS Function Compiler. (Hughes, 2024)
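The general pattern is easy to preview. The following minimal sketch (not the author's implementation; the function name and lookup values are hypothetical) shows how PROC FCMP defines a user-defined "getter" that can then be called inside a DATA step once CMPLIB points at the function library:

```sas
proc fcmp outlib=work.funcs.demo;
   /* hypothetical getter: in practice the lookup would read control data
      rather than hardcode values */
   function get_label(code $) $ 40;
      if code = 'A' then return('Active');
      else return('Unknown');
   endfunc;
quit;

/* make the compiled functions visible to subsequent steps */
options cmplib=work.funcs;

data example;
   status = get_label('A');   /* retrieves 'Active' via the getter */
run;
```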


BB-039 : SAS Data-Driven Software Design: How to Develop More Modular, Maintainable, Fixable, Flexible, Configurable, Compatible, Reusable, Readable Software through Independent Control Tables and Other Control Data
Troy Hughes, Data Llama Analytics

Data-driven design describes software design in which the control logic, program flow, business rules, data models, data mappings, and other dynamic and configurable elements are abstracted to control data that are interpreted by (rather than contained within) code. Thus, data-driven design leverages parameterization and external data structures (including configuration files, control tables, decision tables, data dictionaries, business rules repositories, and other control files) to produce dynamic software functionality. This hands-on workshop introduces real-world scenarios in which the flexibility, configurability, reusability, and maintainability of SAS software are improved through data-driven design methods, as introduced in the author's 2022 book: SAS Data-Driven Development: From Abstract Design to Dynamic Functionality, Second Edition. Scenarios install the attendee as the newest intern at Scranton, Pennsylvania's most infamous paper supply company, tasked with converting legacy, hardcoded SAS programs into flexible, configurable, data-driven designs. Come help Jim, Dwight, Stanley, and Phyllis sell more paper--and learn data-driven best practices in the process!


BB-040 : Fun With SAS and Emoji: What Might a Rebus-Influenced Programming Language Look Like?
Ted Conway, Self

Remember those fun Highlights for Children stories in which words were replaced with pictures to help and engage young readers? Ever wonder what that might look like in a programming language? In this session, we'll not only take a whimsical look at some examples of rebus-flavored SAS and SQL code snippets but also demonstrate rudimentary SAS and Python preprocessors that translate emoji into executable code using SAS's Unicode string 'K' functions and the Python regex package. This session is intended for all SAS users.


BB-042 : Going from PROC SQL to PROC FedSQL for CAS Processing: Common mistakes to avoid
Vijayasarathy Govindarajan, SAS Institute

SAS 9 customers are increasingly looking at moving to SAS Viya to harness the power of the new distributed, in-memory Cloud Analytic Services (CAS) engine. This often speeds up existing processes many times over and allows analytics to run on huge datasets faster. One of the key areas of this migration involves updating SAS 9 PROC SQL code to take advantage of the processing capabilities of CAS. This is made possible by a new(er) procedure in the SAS arsenal: PROC FedSQL. There are many differences between PROC SQL and PROC FedSQL for CAS, ranging from supported data types and available functions to applying formats, quoting strings, and referencing macro variables. In my experience, users new to SAS Viya often make mistakes while migrating code to FedSQL which arise from a few basic misconceptions. This paper aims to clarify the key differences between PROC SQL and PROC FedSQL for CAS. It will also highlight common mistakes when adapting SQL code for CAS, offering guidance on how to avoid them. The goal is to help users leverage the power of CAS effectively without getting bogged down by a lengthy process of fixing small, easily preventable errors when converting code to FedSQL.
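One frequently cited difference is date-literal syntax. The sketch below (illustrative only; it assumes a CAS session named mysess and tables loaded into the casuser caslib) contrasts a SAS date constant in PROC SQL with the ANSI date literal that FedSQL expects:

```sas
/* PROC SQL (SAS 9): SAS date constant */
proc sql;
   create table work.recent as
   select name, sales
   from work.orders
   where orderdate >= '01JAN2024'd;
quit;

/* PROC FedSQL for CAS: ANSI date literal, in-memory caslib tables */
proc fedsql sessref=mysess;
   create table casuser.recent as
   select name, sales
   from casuser.orders
   where orderdate >= date '2024-01-01';
quit;
```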


BB-043 : Accelerating Your SAS Data Step: Tips and Best Practices for SAS Viya Migration
Vijayasarathy Govindarajan, SAS Institute

The SAS Viya platform provides cloud-enabled, in-memory, and parallel processing capabilities for advanced analytics. While a "lift and shift" approach allows SAS 9 Data Step code to run in SAS Viya, realizing significant performance improvements requires targeted code adjustments. For a successful migration, SAS programmers need to understand the SAS Viya architecture, its similarities to and differences from SAS 9, and effective conversion guidelines. This presentation offers practical tips for migrating Data Step code, including optimizing in-memory processing, handling BY-group processing, addressing function and format nuances, incorporating open-source language support, and utilizing performance options. These insights aim to help programmers leverage the full potential of SAS Viya while avoiding common migration challenges. This presentation hopes to equip attendees with actionable insights to transition their SAS 9 Data Step code and to ensure they harness Viya's advanced capabilities to achieve optimal performance outcomes.


BB-044 : SAS Macros and PROC FCMP: A Comparative Inquiry into Reusability and Logic Design
Vijayasarathy Govindarajan, SAS Institute

SAS provides two powerful tools for enhancing code reuse and maintainability: the macro language (%MACRO) for code generation, and PROC FCMP for encapsulating reusable logic. While they may appear to offer overlapping functionality at first glance, they differ fundamentally in both purpose and execution timing. This session presents a comparative analysis of %MACRO and PROC FCMP, focusing on their distinct roles in modular program design. A clear understanding of these differences helps avoid the common misuse of macros for logic that is more appropriately handled by FCMP functions. The session will review typical approaches to creating and using macros, including function-style macros, and demonstrate how PROC FCMP functions and subroutines are defined and executed. Examples will illustrate differences in execution timing and clarify misconceptions about functional overlap. The presentation aims to provide a solid understanding of when and how to use SAS Macros and PROC FCMP, leading to cleaner, more modular, and maintainable SAS programs.


BB-049 : Use ODS Excel, ODS PDF, ODS HTML5, ODS LAYOUT
LeRoy Bessler, Bessler Consulting and Research

The three most frequently used ODS "destinations" are ODS Excel, ODS PDF, and ODS HTML5 (the current successor to, and better than, ODS HTML). The fourth most frequently used should be, or could be, ODS LAYOUT (within ODS HTML5 or ODS PDF). With it, Anything Anywhere All At Once is possible for composites of tables, graphs, and text. Get acquainted with and get started with any destination you don't already know, or learn some tips to get more value from what you do already know. This is a tour and an introduction.
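The composite idea is simple to preview. A minimal sketch (illustrative only, using the SASHELP.CLASS sample data) that places a table and a graph side by side in a PDF via ODS LAYOUT:

```sas
ods pdf file='composite.pdf';
ods layout gridded columns=2;     /* two-column grid on the page */

ods region;                       /* region 1: a table */
proc print data=sashelp.class(obs=5);
run;

ods region;                       /* region 2: a graph */
proc sgplot data=sashelp.class;
   scatter x=height y=weight;
run;

ods layout end;
ods pdf close;
```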


BB-050 : Cutting Edge Regression Methods: Ridge, LASSO, LOESS, and GAM
David Corliss, Peace-Work

This paper presents a brief introduction to recent advances in regression methods. Techniques demonstrated include ridge regression, LASSO, local polynomial regression (LOESS), and generalized additive models (GAM). Each method is presented separately, with a description of the SAS procedure used to implement it and recommendations for applying the method in practical situations. A quick introduction to each method is followed by two worked examples, with discussion of use cases, SAS procedure options, and producing graphical output.


BB-063 : What is Machine Learning, Anyway
Jim Box, SAS Institute

Machine Learning models are the backbones of Artificial Intelligence systems, and as such are being talked about everywhere, but do you know how they work? In this session, we'll look at the process of using machine learning models, cover some of the terminology, and discuss how they are different from the standard statistical models you might be more familiar with.


BB-065 : Enhance your Coding Experience with the SAS Extension for VS Code
Jim Box, SAS Institute

Visual Studio Code (VS Code) from Microsoft is an open-source code editor that is very popular among developers for its ease of use across all programming languages, driven by a robust extension ecosystem. The SAS VS Code extension is an open-source, freely available add-on that allows you to use VS Code to connect to any modern SAS environment, from SAS 9.4 on your local machine to SAS Viya in the cloud. The key features include syntax highlighting, code completion, syntax help, a data viewer, and my favorite, SAS Notebooks, which offer an exciting way to share content and comments. We'll look at the extension, how to use it, and explore how you can get involved with the direction of how this product evolves.


BB-066 : How did that Python code get in my SAS program?
Jim Box, SAS Institute

Did you know that you can execute Python code inside a SAS Program? With the SAS Viya Platform, you can call PROC PYTHON and pass variables and datasets easily between a Python call and a SAS program. In this paper, we will look at ways to integrate Python in your SAS Programs.
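As a quick preview, the sketch below (illustrative only, using the SASHELP.CLASS sample data) shows the basic pattern: Python code runs inside a SUBMIT block, and the SAS callback object moves data sets in and out as pandas DataFrames:

```sas
proc python;
submit;
# SAS.sd2df / SAS.df2sd transfer data between SAS data sets and pandas
df = SAS.sd2df('sashelp.class')
df['bmi'] = 703 * df['Weight'] / df['Height']**2   # BMI from lbs/inches
SAS.df2sd(df, 'work.class_bmi')                    # write back to SAS
endsubmit;
run;
```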


BB-067 : From Muggles to Macros: Transfiguring Your SAS Programs with Dynamic, Data-Driven Wizardry
Josh Horstman, PharmaStat LLC
Richann Watson, DataRich Consulting

The SAS macro facility is an amazing tool for creating dynamic, flexible, reusable programs that can automatically adapt to change. This presentation uses a series of examples to demonstrate how to transform static "muggle" code full of hardcodes and data dependencies by adding macro language magic to create data-driven programming logic. Don't hardcode data values into your programs. Cast a vanishing spell on data dependencies and let the macro facility write your SAS code for you!


BB-068 : More Muggles, More Macros: Adding Advanced Data-Driven Wizardry to Your SAS Programs
Josh Horstman, PharmaStat LLC
Richann Watson, DataRich Consulting

In their popular 2024 presentation "From Muggles to Macros", Horstman and Watson delivered a spell book full of macro magic to enhance SAS programs with data-driven wizardry. This exciting sequel to that enchanting performance adds to the list of incantations for creating dynamic, flexible, reusable programs that can automatically adapt to change. New charms include the use of control tables, the CALL EXECUTE routine, and of course, more macro language techniques. Don't hardcode data values into your programs. Cast a vanishing spell on data dependencies and let the macro facility write your SAS code for you!
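CALL EXECUTE is one of the simplest of these incantations to preview. The sketch below (illustrative only; the control table WORK.CONTROL and its DSNAME column are hypothetical) generates one PROC MEANS step per row of a control table, so adding a data set requires no code change:

```sas
/* data-driven code generation: one PROC MEANS per control-table row */
data _null_;
   set work.control;     /* hypothetical control table with column DSNAME */
   call execute(cats('proc means data=', dsname, '; run;'));
run;
```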


BB-070 : Fifteen Functions to Supercharge Your SAS Code
Josh Horstman, PharmaStat LLC

The number of functions included in SAS software has exploded in recent versions, but many of the most amazing and useful functions remain relatively unknown. This paper will discuss such functions and provide examples of their use. Both new and experienced SAS programmers should find something new to add to their toolboxes.


BB-073 : Enhancing Your PROC REPORT Output: Top Tips
Brian Knepple, J & J MedTech

To get the best results with SAS PROC REPORT, start by choosing the variables you want to display and decide how to organize your data through grouping or sorting. Use the "DEFINE" option to assign labels, formats, and calculations for columns. Apply "BREAK" and "RBREAK" to add summaries and totals for better clarity. Use "COMPUTE" to perform custom calculations or to highlight specific data points. Additionally, the "STYLE" option can help you enhance the appearance of your report. By combining these features, you can create clear, visually appealing reports that effectively communicate your data.
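The features above can be sketched in a single step (illustrative only, using the SASHELP.CLASS sample data):

```sas
proc report data=sashelp.class;
   column sex age height;
   define sex    / group 'Sex';                      /* grouping          */
   define age    / group 'Age';
   define height / analysis mean format=5.1 'Mean Height';
   break after sex / summarize;                      /* per-group summary */
   rbreak after    / summarize;                      /* overall total     */
run;
```

The STYLE= option can be added to any DEFINE, BREAK, or RBREAK statement to control appearance, and COMPUTE blocks can derive or highlight values beyond what is shown here.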


BB-074 : Leveraging SQL and SAS for Analysis-Ready Datasets
Shavonne Standifer, Statistical Business Analyst

Data professionals often use a combination of various technologies. Effective data management is essential for ensuring high-quality analysis and decision-making. SQL is a powerful language for querying and manipulating relational databases, while SAS offers a suite of advanced tools for data preparation, statistical analysis, and reporting. By integrating these technologies, organizations can improve data management, enhance data integrity, and foster collaboration across teams. This paper provides a general guide to utilizing SQL and SAS programming to efficiently create, manage, and maintain analysis-ready datasets. Through step-by-step instructions and real-world examples, readers will gain the skills needed to harness the power of structured query language (SQL) and SAS for streamlining data processes within their organizations.


BB-075 : Automate in a Dash with SAS Time-Saving Techniques for Building Quality Improvement Dashboards
Shavonne Standifer, Statistical Business Analyst

Building programs that leverage the analytic and reporting powers of SAS reduces the time to solution for critical tasks. This paper discusses, through example, how Base SAS tools, such as FILENAME, macros, and ODS, combined with the built-in scheduler housed in SAS Enterprise Guide, can be used to automate the process from raw data to dashboard view quickly and efficiently. This paper is divided into two main parts. In the first part, we discuss how to use scripting language to bring data from a file location into the SAS environment, how to build programs that clean and subset the data, and how to transform the process with a macro. In the second part, we discuss how to use SAS procedures and ODS to transform the resulting data into a quality improvement dashboard view that can be set to run automatically and be sent to team members at a scheduled time.


Pharma and Life Sciences

PL-014 : Applications of PROC COMPARE to Parallel Programming and other projects
Jayanth Iyengar, Data Systems Consultants LLC

PROC COMPARE is a valuable Base SAS procedure that is used heavily in the pharma industry and other areas. By default, PROC COMPARE reconciles two data sets to determine whether they have equivalent sets of records and variables. In the clinical field and elsewhere, PROC COMPARE is often used to validate data sets in projects that involve parallel programming, where programmers independently perform the same tasks. In this paper, I will discuss the role PROC COMPARE plays in different SAS tasks, including DATA step merges, parallel programming, generation data sets, and more.


PL-017 : Have a Date with ISO? Using PROC FCMP to Convert Dates to ISO 8601
Richann Watson, DataRich Consulting

Programmers frequently have to deal with dates and date formats. At times, determining whether a date is in a day-month or month-day format can leave us confounded. Clinical Data Interchange Standards Consortium (CDISC) has implemented the use of the International Organization for Standardization (ISO) format, ISO 8601, for datetimes in SDTM domains, to alleviate the confusion. However, converting "datetimes" from the raw data source to the ISO 8601 format is no picnic. While SAS has many different functions and CALL subroutines, there is not a single magic function to take raw datetimes and convert them to ISO 8601. Fortunately, SAS allows us to create our own custom functions and subroutines. This paper illustrates the process of building a custom function with custom subroutines that takes raw datetimes in various states of completeness and converts them to the proper ISO 8601 format.
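For complete datetimes, SAS does supply a building block that a custom function can wrap. The sketch below (a hypothetical helper, not the paper's full solution, which also handles partially missing components) shows an FCMP function applying the E8601DT format:

```sas
proc fcmp outlib=work.funcs.iso;
   /* hypothetical helper: complete SAS datetime value -> ISO 8601 string */
   function dt2iso(dt) $ 19;
      return(put(dt, e8601dt19.));
   endfunc;
quit;

options cmplib=work.funcs;

data test;
   iso = dt2iso('15MAR2024:09:30:00'dt);   /* 2024-03-15T09:30:00 */
run;
```

The interesting work in the paper lies in what this sketch omits: raw datetimes in various states of completeness, which no single built-in format handles.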


PL-018 : Worried about that Second Date with ISO? Using PROC FCMP to Convert and Impute ISO 8601 Dates to Numeric Dates
Richann Watson, DataRich Consulting

Within the life sciences, programmers often find themselves doing a lot of date matching, converting between character and numeric values, and imputing missing components. Clinical Data Interchange Standards Consortium (CDISC) Study Data Tabulation Model (SDTM) domains have implemented the use of the International Organization for Standardization (ISO) format, ISO 8601, for datetimes. These dates are stored as character strings with missing components denoted using a single hyphen. Although this format helps to standardize how dates and times are captured so that there is no confusion as to what the date represents, it leaves a longing for something more compatible for analysis purposes: determining durations and number of days from a reference point. The conversion of ISO dates to a numeric format requires a serious commitment, especially when partial ISO dates require imputations. Although SAS offers a variety of built-in formats and functions that can appease both sides on a date, i.e., converting complete ISO dates to numeric values or numeric dates to ISO dates, there is no SAS-provided function that will help with the required conversion and imputation of a partial date. Fortunately, with the use of the FCMP procedure, we can create our own custom functions to help achieve our desired goal. This paper illustrates the process of building a custom function that will take a date that is captured in the appropriate ISO format in SDTM (--DTC) and convert that date to a numeric format while also giving partial dates the extra attention to impute missing components. Additionally, this custom function sets the correct date imputation variable (--DTF), so you always know just how much of a blind date your derived value really is.


PL-022 : Mind the Gaps: Automating Multiple Imputation in Clinical Trial Workflows
Crisa Chen, Eli Lilly and Company
Jiangang Cai, Eli Lilly and Company

Handling missing data is a persistent challenge in clinical trials, particularly when it occurs across multiple endpoints, timepoints, or treatment periods. Beyond the standard assumption of Missing At Random (MAR), real-world trial data often include sporadically missing values, structurally missing patterns, or missingness driven by intercurrent events (ICE), such as treatment discontinuation, rescue medication use, or major protocol deviations. These situations require careful pre-imputation processing to avoid biased estimates or invalid imputations. Without a systematic and scalable approach, implementing Multiple Imputation (MI) across diverse datasets can become error-prone, labor-intensive, and difficult to reproduce. To address these challenges, we developed a modular and parameter-driven SAS macro pipeline that automates the MI process within the ADaM data workflow. This solution standardizes the handling of intercurrent events, supports flexible imputation strategies, and streamlines the generation of analysis-ready outputs. By embedding MI into a structured pipeline, the approach enhances consistency, scalability, and reproducibility across studies, while reducing manual effort and programming variability.


PL-032 : Last Observation Carried Forward (LOCF) in Longitudinal Clinical Studies: Adopting a Functional Approach to Imputing Missing Values Using PROC FCMP, the SAS Function Compiler
Troy Hughes, Data Llama Analytics

Last observation carried forward (LOCF) is a ubiquitous method of imputing missing values in longitudinal studies, and is commonly implemented when a subject (i.e., a patient or participant) misses a scheduled visit and data cannot be collected (or generated). In general, the last "valid" value from a previous visit is retained for the later visit on which the data could not be obtained, and this conservative estimation succeeds in cases where the actual value would have been little changed. Nuanced criteria may stipulate which prior values count as "valid" (e.g., after the start of treatment) as well as for how long (e.g., how many days, visit weeks, consecutive missed visits) a value can be used to impute other values. Given these complexities, LOCF solutions implemented in SAS historically adopt a procedural approach, and often require multiple DATA steps and/or procedures to impute data both across observations and within subjects. Conceptually, however, a functional approach can be envisioned in which LOCF could be calculated using a function call, that is, delivering the same functionality through a single line of code while hiding (abstracting) the complexity of the calculation inside the function's definition. The FCMP procedure can deliver this functionality, enabling SAS practitioners to build user-defined functions, even those that perform inter-observation calculations, and this text demonstrates a user-defined subroutine that dynamically calculates LOCF while relying on CDISC and ADaM standards, data structures, and nomenclature. The software design concepts herein are adapted from the renowned SAS Press book: PROC FCMP User-Defined Functions: An Introduction to the SAS Function Compiler. (Hughes, 2024)
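For context, the procedural approach the paper contrasts typically looks like the sketch below (illustrative only; the data set WORK.VISITS and its USUBJID, VISITNUM, and AVAL columns are hypothetical stand-ins for ADaM-style variables):

```sas
/* classic procedural LOCF: sort, then RETAIN within subject */
proc sort data=work.visits;
   by usubjid visitnum;
run;

data work.locf;
   set work.visits;
   by usubjid;
   retain last_aval;
   if first.usubjid then last_aval = .;     /* reset per subject        */
   if not missing(aval) then last_aval = aval;
   else aval = last_aval;                   /* carry last value forward */
run;
```

The paper's FCMP-based subroutine collapses this multi-step logic into a single callable statement.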


PL-033 : Geocoding with the Google Maps API: Using PROC FCMP To Call User-Defined SAS and Python Functions That Geocode Coordinates into Addresses, Calculate Routes, and More!
Troy Hughes, Data Llama Analytics

Software interoperability describes the ability of software systems, components, and languages to communicate effectively with each other, and must be prioritized in today's multilingual and open-source development environments. PROC FCMP, the SAS Function Compiler, enables Python functions to be wrapped in (and called from) SAS user-defined functions (and subroutines), and the full panoply of SAS environments supports FCMP, including SAS Display Manager, SAS Enterprise Guide, SAS Studio, SAS Viya, and the latest Cary show pony, SAS Workbench. Productivity and the pace of development are maximized when existing open-source code, such as Python user-defined functions, can be run natively from a Python interactive development environment (IDE) rather than having to be needlessly recoded into the Base SAS language. This text demonstrates SAS and Python user-defined functions that collaboratively call the Google Maps Platform APIs to geocode street addresses into latitude/longitude coordinates, and to calculate driving distances between locations. The scenarios in this text demonstrate the application of the Google Maps Platform for clinical trials research, and the technical concepts are adapted from Chapter 8 of the renowned SAS Press book: PROC FCMP User-Defined Functions: An Introduction to the SAS Function Compiler. (Hughes, 2024)


PL-046 : COVID-19 Explored Using SAS and ODS Graphics: InfoGeographic and Data Graphic Analysis & Pictures
LeRoy Bessler, Bessler Consulting and Research

See the COVID-19 data for 2020 through 2023 analyzed and displayed in ways available nowhere else. Where was it? How severe? What might have contributed to it? Did hot spots cause it to proliferate? Come and see what the data and its visualization reveal.


PL-069 : Jazz Up Your Profile: Perfect Patient Profiles in SAS using ODS Statistical Graphics
Josh Horstman, PharmaStat LLC
Richann Watson, DataRich Consulting

Patient profiles are often used to monitor the conduct of a clinical trial, detect safety signals, identify data entry errors, and catch protocol deviations. Each profile combines key data collected regarding a single subject: everything from dosing to adverse events to lab results. In this presentation, two experienced statistical programmers share how to leverage the SAS Macro Language, the Output Delivery System (ODS), the REPORT procedure, and ODS Statistical Graphics to blend both tabular and graphical elements. The result is beautiful, highly customized, information-rich patient profiles that meet the requirements for managing a modern clinical trial.


PL-081 : The Rare Disease Clinical Research Network (RDCRN) works to advance medical research on rare diseases by providing support for clinical studies and facilitating collaboration, study enrollment and data sharing.
Kelly Olano, Cincinnati Children's Hospital Medical Center
Pierce Kuhnell, Cincinnati Children's Hospital Medical Center
Laurie Smith, Cincinnati Children's Hospital Medical Center

Rare diseases, defined in the United States as conditions affecting fewer than 200,000 individuals, collectively impact approximately 30 million Americans and 350 million people worldwide. Despite their prevalence, research in this domain faces significant challenges, including limited patient populations, underdiagnosis, insufficient funding, and regulatory hurdles. The Rare Disease Clinical Research Network (RDCRN), established under the Rare Diseases Act and funded by the NIH, addresses these challenges by fostering collaboration across 20 research teams conducting over 125 active studies on more than 200 rare diseases. The Data Management Coordinating Center (DMCC) at Cincinnati Children's Hospital plays a pivotal role in standardizing data collection, ensuring compliance, and supporting analysis and reporting. Key initiatives include migrating legacy data into REDCap and developing tools for data quality and regulatory compliance. Current applications focus on leveraging SAS Viya for providing standardized and custom reports, and analysis datasets with continuing advancements for automated reporting and interactive dashboards, enhancing data accessibility and usability. These efforts aim to accelerate research, improve patient outcomes, and strengthen the infrastructure for rare disease studies.


Posters

PO-011 : ExCITE-ing! Build Your Paper's Reference Section Programmatically Using Lex Jansen's Website and SAS
Louise Hadden, Cormac Corporation

One challenge in writing a SAS white paper is creating the perfect reference section, properly acknowledging those who have inspired and paved the way. Luckily, with clever use of such tools as Lex Jansen's website, SAS's ability to read in and manipulate varied data sources, and the Microsoft Word citation manager, every author can succeed in properly referencing their white papers. This paper and ePoster will demonstrate how to accomplish this goal.


PO-012 : The World is Not Enough: Base SAS Visualizations and Geolocations
Louise Hadden, Cormac Corporation

Geographic processing in SAS has recently undergone some major changes: as of Version 9.4 Maintenance Release M5, many procedures formerly part of SAS/GRAPH are now available in Base SAS. At the same time, SAS has added new procedures, such as PROC SGMAP in Base SAS, that build on the functionality of SAS/GRAPH's PROC GMAP and incorporate ODS graphics techniques including attribute maps and image annotation. This paper and poster will use PROC SGMAP to replicate a map of the world the author originally created with SAS/GRAPH's PROC GMAP and the annotate facility, mapping three different metrics on a world map. New SAS mapping and SG procedure techniques will be demonstrated, following Agent 007's adventures across the globe.


PO-036 : Who's Bringing That Big Data Energy? A 48-Year Longitudinal Analysis of 30,000 Presentations in the SAS User Community To Elucidate Top Contributors and Rising Stars
Troy Hughes, Data Llama Analytics

This analysis examines presentations at SAS user group conferences between 1976 and 2023. It includes presentations referenced on www.LexJansen.com (aka "the LEX") during this timeframe, which are drawn from multiple conferences, including: SAS User Group International (SUGI, may she rest in peace), SAS Global Forum (SGF, may she be revived), SAS Explore, Western Users of SAS Software (WUSS), Midwest SAS Users Group (MWSUG), South Central SAS Users Group (SCSUG), Southeast SAS Users Group (SESUG), Northeast SAS Users Group (NESUG), Pacific Northwest SAS Users Group (PNWSUG), and Pharmaceutical Software Users Group (PharmaSUG). This analysis identifies top contributors, including authors who have presented most abundantly at specific conferences, as well as across all conferences. For example, the SAS superstars and most prolific presenters of all time are ranked and recognized for bringing that big data energy (BDE): Kirk Paul Lafler, Arthur L. Carpenter, Louise Hadden, Charlie Shipp, and Ronald J. Fehd! Rising stars, who may be new to the conference scene yet are contributing significantly, are also identified. In addition to quantifying and extolling the contributions of these authors, this analysis aims to assist the leaders of future conferences in identifying key speakers to invite. Threats to data quality are also discussed: the reality of third-party data, as conferences often feed the LEX with raw presentation metadata that either are unstandardized or have patently wrong information (like incorrect author names or missing coauthors). Finally, and perhaps with some irony, Python 3.11.5 (and Pandas 2.1.0) was used exclusively to extract, clean, transform, and analyze all data.


PO-060 : Market Making Control Problem with Inventory Risk
Jimin Lee, Washington University in St. Louis

A high-frequency market maker reaps profit from the bid-ask spread, rapidly transacting their buy and sell limit orders. The optimal control problem concerns the placement of buy and sell limit order quotes by the market maker to maximize their total profit. It also concerns minimizing the market maker's inventory of shares, which incurs a high liquidation cost at the end of the day. There have been various proposed solutions, some taking place in discrete time and others in continuous time. Most solutions, however, assume that limit orders have a size of 1, which is not the case in real life. In this project, we aim to incorporate the size of limit orders placed by market makers in solving the optimal control problem and implement it in Python. We will calibrate the model parameters and test the performance of the proposed market making strategy against real-world data.


Visualization and Reporting

VR-005 : Dashboards Made Easy Using SAS Software
Kirk Paul Lafler, SasNerd

Organizations around the world develop business intelligence and analytics dashboards, sometimes referred to as enterprise dashboards, to display the status of "point-in-time" metrics and key performance indicators. Effectively designed dashboards extract real-time data from multiple sources for the purpose of highlighting important information, numbers, tables, statistics, metrics, performance scorecards and other essential content. This paper explores essential rules for "good" dashboard design, the metrics frequently used in dashboards, and the use of best practice programming techniques in the design of quick and easy dashboards using SAS software. Learn essential programming techniques to create real-world dashboards using Base-SAS software including PROC SQL, macro, Output Delivery System (ODS), ODS HTML, ODS Excel, ODS Layout, ODS Statistical Graphics, PROC SGPLOT, PROC SGPIE, and other technologies.


VR-010 : The (ODS) Output of Your Desires: Creating Designer Reports and Data Sets
Louise Hadden, Cormac Corporation

SAS procedures can convey an enormous amount of information, sometimes more than is needed. Most SAS procedures generate ODS objects behind the scenes. SAS uses these objects with style templates that have custom buckets for certain types of output to produce the output that we see in all destinations (including the SAS listing). By tracing output objects and ODS templates using ODS TRACE (DOM) and by manipulating procedural output and ODS OUTPUT objects, we can pick and choose just the information that we want to see. We can then harness the power of SAS data management and reporting procedures to coalesce the information collected and present it accurately and attractively.


VR-034 : From Word Clouds to Phrase Clouds to Amaze Clouds: A Data-Driven Python Programming Solution To Building Configurable Taxonomies That Standardize, Categorize, and Visualize Phrase Frequency
Troy Hughes, Data Llama Analytics

Word clouds visualize the relative frequencies of words in some body of text, such as a website, white paper, blog, or book. They are useful in identifying contextual focus and keywords; however, word clouds as commonly defined and implemented suffer numerous limitations. First, multi-word phrases such as "data set" or "Base SAS" are unfortunately segmented into the single words "data," "set," "Base," and "SAS." Second, desired capitalization often cannot be specified, such as visualizing "PROC PRINT" even when its lowercase "proc print" is observed in text or code. Third, spelling variations (e.g., singular and plural nouns, various verb conjugations, abbreviations and acronyms) are not mapped to each other. Similarly, and fourth, comparable words or phrases (e.g., "PROC PRINT" and "PRINT procedure") are not mapped to each other, representing a further lack of entity resolution. This text and its Python Pandas solution seek to overcome the data quality, data integrity, and data standardization issues that plague word clouds by defining and applying configurable taxonomy data models that can impart more meaning and precision to the ultimate word/phrase cloud visualizations. The result is a phrase cloud that amazes: an amaze cloud!
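The variant-mapping idea described above can be sketched in a few lines of Python. The taxonomy structure here is hypothetical (the paper's actual data model is not shown in the abstract): each canonical phrase maps to the spelling and phrasing variants that should be standardized to it before frequencies are counted:

```python
import re
import pandas as pd

# Hypothetical configurable taxonomy: canonical phrase -> variants that
# should be resolved to it (matching is case-insensitive).
taxonomy = {
    "PROC PRINT": ["PRINT procedure"],
    "data set":   ["data sets", "dataset", "datasets"],
}

def phrase_frequencies(text, taxonomy):
    """Count canonical-phrase frequencies after resolving variants."""
    counts = {}
    for canonical, variants in taxonomy.items():
        # One alternation per canonical phrase, escaping regex metacharacters
        pattern = "|".join(re.escape(v) for v in [canonical] + variants)
        counts[canonical] = len(re.findall(pattern, text, flags=re.IGNORECASE))
    return pd.Series(counts).sort_values(ascending=False)
```

Feeding the resulting series to any word-cloud renderer then displays the canonical, correctly capitalized phrase at the combined frequency of all its variants, which is the entity-resolution step plain word clouds lack.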


VR-045 : Maps, maps, and more maps using SAS PROC SGMAP!
Melinda Macdougall, Cincinnati Children's Hospital Medical Center

SAS expanded its mapping capabilities with the introduction of PROC SGMAP in the SAS 9.4M5 release. This procedure expands upon PROC GMAP and allows for more customization. The data sets included in the MAPSGFK library provide detailed boundary location data points for locations across the world. These data sets provide easy access to plotting maps at various levels, including continents, countries, states, counties, and cities. These maps can be combined with publicly available data from sources such as data.census.gov to create many useful plots. In this presentation, I will walk through examples of maps at multiple levels: US, state (Ohio), metropolitan area (Greater Cincinnati), and county (Hamilton County, Ohio).


VR-048 : Wise Graphic Design & Color Use for Data Graphics Easily, Quickly, Correctly Understood
LeRoy Bessler, Bessler Consulting and Research

Let me take you straight past the defaults to The Best. Learn principles and methods to make data graphics that are easily, quickly, correctly understood, and how to avoid color decisions that obstruct graphic communication. See widely usable graphic designs that reveal what the data can tell your viewer or you. The ideas are software-independent, but are demonstrated with SAS ODS Graphics.