This study aimed to develop different models to forecast the daily number of patients seeking emergency department (ED) care in a general hospital according to calendar variables and ambient temperature readings and to compare the models in terms of forecasting accuracy. The authors developed and tested six different models of ED patient visits using total daily counts of patient visits to an ED in Sao Paulo, Brazil, from January 1, 2008, to December 31, 2010. The first 33 months of the data set were used to develop the ED patient visits forecasting models (the training set), leaving the last 3 months to measure each model's forecasting accuracy by the mean absolute percentage error (MAPE). Forecasting models were developed using three different time-series analysis methods: generalized linear models (GLM), generalized estimating equations (GEE), and seasonal autoregressive integrated moving average (SARIMA). For each method, models were explored with and without the effect of mean daily temperature as a predictive variable. The daily mean number of ED visits was 389, ranging from 166 to 613. Data showed a weekly seasonal distribution, with highest patient volumes on Mondays and lowest patient volumes on weekends. There was little variation in daily visits by month. GLM and GEE models showed better forecasting accuracy than SARIMA models. For instance, the MAPEs from GLM models and GEE models at the first month of forecasting (October 2010) were 11.5 and 10.8% (models with and without control for the temperature effect, respectively), while the MAPEs from SARIMA models were 12.8 and 11.7%. For all models, controlling for the effect of temperature resulted in forecasting ability that was similar to or worse than that of models with calendar variables alone, and forecasting accuracy was better for the short-term horizon (7 days in advance) than for the longer term (30 days in advance).
This study indicates that time-series models can be developed to provide forecasts of daily ED patient visits, and forecasting ability was dependent on the type of model employed and the length of the time horizon being predicted. In this setting, GLM and GEE models showed better accuracy than SARIMA models. Including information about ambient temperature in the models did not improve forecasting accuracy. Forecasting models based on calendar variables alone did in general detect patterns of daily variability in ED volume and thus could be used for developing an automated system for better planning of personnel resources.
Reports from different countries, including the United States, the United Kingdom, and Brazil, have shown an increase in demand for emergency department (ED) care, resulting in frequently overcrowded EDs, lengthy waiting times for assistance, and an overall perception by patients of poor health care.1-4 Prolonged waiting times are described as a major factor for dissatisfaction with ED care,5, 6 and patients are more likely to leave without being seen as waiting time increases.5 While common practice is to divert patients from EDs in times of overcrowding,2 using data on daily patient volume for better planning of personnel resources might increase ED efficiency, as well as improve ED patient care quality.5, 7-9 A report from the National Audit Office on inpatient admissions from acute hospitals in the United Kingdom has stated that the trusts could make more effective use of their knowledge of patterns of ED admissions to assess the likely demand for their resources.10 A time series is a set of chronologically ordered observations, and forecasting methods use past values of any given time series to predict its future behavior.11 Time-series models can be used to forecast future ED patient visits based on the estimated effect of predictor variables, and such forecasts can be used for proactive bed and staff management and for facilitating patient flow.12, 13 For example, the finding that Sundays had a much higher volume of patients in the ED at a hospital in Israel led to the decision to allocate an additional physician to staff every Sunday, thus alleviating ED congestion.9 Batal et al.5 have reported an 18.5% decrease in patients leaving without being seen in an ED and a 30% decrease in complaints after adjusting staff in accordance with the results of an applied ED patient visits forecasting model.
Although decisions on staffing are commonly based on personal experience,14-16 a rational approach to allocation of resources would be of great importance for improving the quality of care delivered in EDs.1-3, 5, 17 A number of factors can influence daily ED visits, and a patient visits forecasting model should include those factors. Previous studies have shown that ED visits present cyclical variations according to calendar variables, such as day of the week, time of the year, and the occurrence of public holidays.1, 5, 9, 12, 13, 17 Temperature variables have also been included in some patient visits forecasting models,14, 17-20 because many studies have demonstrated the association of climate factors, temperature in particular, with the occurrence of mortality and morbidity outcomes.21-24 For instance, weather forecasts in the United Kingdom have been used for warning chronic obstructive pulmonary disease patients when their health is likely to be affected.25 The predictive effect of temperature on daily ED visits, however, is still uncertain. While some studies have shown there is an association between these variables,12, 17-20, 26 other authors argue that including temperature adds uncertainty to the model in exchange for little improvement in forecasting accuracy.5, 12, 27 Because the temperature effect depends on the geographical location and on characteristics of the ED,26 the relevance of including weather variables for improving the overall prediction accuracy should be tested when developing a particular forecasting model for daily ED visits. Our study aims to develop models to forecast the daily number of patients seeking ED care in a busy general hospital in a major world city (Sao Paulo, Brazil) according to calendar variables and ambient temperature.
Different time-series approaches can be employed to develop forecast models, and the relevant literature indicates that there is no obvious supremacy of one method over others.11 We thus explored three different analytic approaches to develop daily ED patient visits forecasting models as well as the contribution of ambient temperature and compared the models in terms of forecasting accuracy. To develop and compare accuracy of forecast models of ED patient visits using different time-series analysis methods, we evaluated records of daily ED visits to a tertiary hospital. The study was approved by the Ethical Committee Review Board of the University of Sao Paulo Clinics Hospital. The study was conducted in Sao Paulo, Brazil, a city of approximately 11 million people. The ED is the main referral hospital for high-complexity emergency clinical, surgery, and trauma cases occurring in the south and west region of the city. Operating 7 days per week, 24 hours a day, the ED treats approximately 180,000 patients per year. Data on daily ED patient visits, including date and time of arrival and main diagnosis, were extracted from a computerized tracking system at the hospital's information and health department. We extracted total daily counts of all patients who presented to the ED from January 1, 2008, to December 31, 2010. Daily mean temperature data for the study period were obtained from the Sao Paulo Environmental Agency. These environmental data are collected hourly at 12 fully automated monitoring stations throughout Sao Paulo. Mean temperature was calculated as an average of all 24-hour measurements at the 12 stations. The database of daily ED visits and ambient temperature was divided into two periods. The first period, from January 1, 2008, to September 30, 2010, was used for initial data analysis and model development (the “training set”). 
The second period, from October 1, 2010, to December 31, 2010, was used to apply the ED patient visits forecasting models and test their accuracy (the “postsample forecasting set”). The postsample forecasting set was further divided into three forecasting horizons of 1 month each (October, November, and December 2010), and forecast accuracy was assessed at horizons of 7 and 30 days in advance. After forecasting daily ED visits for the first horizon (October) and measuring the model's accuracy, the observed values of ED visits were incorporated into the training set and the model reestimated, with the resulting outputs being used to forecast ED visits for the second horizon (November). The process was then repeated for the third horizon (December). This 30-day horizon approach was chosen as we wanted to simulate a real case scenario in which the forecasting model could be updated with the observed values as time went by, and new forecasting values would be generated for future dates. We explored models of ED patient visits using three forecasting methods: generalized linear models (GLM), generalized estimating equations (GEE), and seasonal autoregressive integrated moving average (SARIMA) models. All analyses were conducted in Stata 12.0 (StataCorp, College Station, TX). Generalized linear models have been used widely in time-series regression studies of health outcomes in relation to environmental variables.21-24, 28 GEEs are an extension of GLMs that have been increasingly used for time-series analysis as well29-32 and provide the advantage of allowing for autocorrelation (nonindependence of ED patient visits on proximate days) to be taken into account in the postsample forecasting set.29, 30 We thus applied Poisson GLM and GEE models, allowing for overdispersion to quantify effects of the predictor variables on daily ED visits and to forecast the number of ED visits in the postsample forecasting set. 
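The re-estimation scheme described above, in which each month of observed visits is folded back into the training set before the next month is forecast, can be sketched as follows. This is only an illustration of the date bookkeeping, not the authors' Stata code; the function names `daterange` and `rolling_origin_splits` are hypothetical, and the dates are those stated in the text.

```python
from datetime import date, timedelta


def daterange(start, end):
    """Yield every date from start to end, inclusive."""
    day = start
    while day <= end:
        yield day
        day += timedelta(days=1)


# Study start and the three one-month forecasting horizons given in the text.
STUDY_START = date(2008, 1, 1)
HORIZONS = [
    (date(2010, 10, 1), date(2010, 10, 31)),
    (date(2010, 11, 1), date(2010, 11, 30)),
    (date(2010, 12, 1), date(2010, 12, 31)),
]


def rolling_origin_splits(study_start, horizons):
    """Return (training_days, forecast_days) pairs with an expanding training window.

    For each horizon, the training window runs from the study start to the
    day before the horizon begins, so earlier horizons become training data
    for later ones.
    """
    splits = []
    for horizon_start, horizon_end in horizons:
        train_end = horizon_start - timedelta(days=1)
        training_days = list(daterange(study_start, train_end))
        forecast_days = list(daterange(horizon_start, horizon_end))
        splits.append((training_days, forecast_days))
    return splits


splits = rolling_origin_splits(STUDY_START, HORIZONS)
for training_days, forecast_days in splits:
    # A model would be refit on training_days here and used to forecast forecast_days.
    print(training_days[-1], len(training_days), forecast_days[0], len(forecast_days))
```

In a real pipeline the body of the loop would refit the chosen model on the training window and score its forecasts on the horizon; the first 7 days of each horizon give the short-term assessment.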
For the GEE models, we considered autoregressive structures up to a 7-day lag, finally choosing an autoregressive structure of 1-day lag based on best model fit. An autoregressive time-series model is a multiple regression model in which the outcome variable is regressed on its past values,11 and the chosen 1-day lag means that the number of ED visits on a given day is mostly affected by the previous day's patient volume. Goodness of fit was assessed through comparison of the quasi-likelihood under the independence model criterion (QIC).33 The third forecasting method we examined was a SARIMA model. Autoregressive integrated moving average (ARIMA) models describe variables in terms of their past values and have been described as the most widely used models in health forecasting.12, 17 SARIMA models extend ARIMA models and allow for the modeling of seasonal patterns. In time-series analysis, seasonality refers to any pattern that repeats with a fixed period, such as the weekly cycle observed in ED daily visits. A SARIMA model is denoted by (p, d, q)(P, D, Q)s, where p is the order of the autoregressive part; d is the order of differencing of the data, through subtraction of observations at a given lag, to make a series stationary; q is the order of the moving average part; P, D, and Q are their seasonal counterparts; and s is the seasonal period. Model identification was guided by inspection of the autocorrelation and partial autocorrelation functions of the series. These indicated that a SARIMA model with an autoregressive structure of a 1-day lag, a moving average component, and a weekly (7-day) seasonal period had the best fit. Because there is some subjectivity in the identification of such models, we compared candidate models based on an information criterion, and the model described was the model with the best fit. All ED patient visits forecasting models included calendar variables as predictors of ED visits. These calendar variables were day of the week, public holidays (a total of 12 days per year), and the days before and after a holiday, because ED visits on such days can be affected by an effect of the holiday. Because an unusual increase in ED visits occurred in Sao Paulo during one part of the study period, an indicator term for that period was included in the models. Because unusual numbers of visits were also observed for December 31 and January 1 of each year, a term for these days was also included in the models.
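The calendar predictors described above can be encoded as indicator variables before being passed to any of the three model types. The sketch below is one possible encoding and rests on stated assumptions: the function name `calendar_features` is hypothetical, Sunday is chosen arbitrarily as the reference day of the week, and the holiday set contains a single illustrative date rather than the 12 public holidays per year used in the study.

```python
from datetime import date, timedelta

WEEKDAYS = ["mon", "tue", "wed", "thu", "fri", "sat", "sun"]
REFERENCE_DAY = "sun"  # assumption: the study does not state its reference day here


def calendar_features(day, holidays):
    """Return 0/1 calendar indicators for one date.

    holidays is a set of datetime.date objects treated as public holidays.
    """
    features = {}
    # Day-of-week indicators; the reference day is left out to avoid collinearity.
    for index, name in enumerate(WEEKDAYS):
        if name != REFERENCE_DAY:
            features["dow_" + name] = int(day.weekday() == index)  # Monday is 0
    # Public holiday and the days immediately before and after one.
    features["holiday"] = int(day in holidays)
    features["day_before_holiday"] = int(day + timedelta(days=1) in holidays)
    features["day_after_holiday"] = int(day - timedelta(days=1) in holidays)
    # Separate term for December 31 and January 1, as described in the text.
    features["year_end"] = int((day.month, day.day) in {(12, 31), (1, 1)})
    return features


# Illustrative holiday set only (one date, not the study's full holiday list).
example_holidays = {date(2010, 11, 15)}
print(calendar_features(date(2010, 11, 16), example_holidays))
```

Each dictionary becomes one row of the design matrix; the seasonal, trend, and temperature terms would be added as further columns.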
Seasonality was controlled for in all models by including seasonal terms, a data transformation that allows for controlling of cyclical patterns in the series. These terms were chosen because their values do not depend on the outcome data and thus can be used in the postsample forecasting set. Long-term trend was controlled for by means of a linear term for date of visit. For each of the three forecasting methods, we tested one model with and one without temperature as a predictor of daily ED visits. Temperature has been shown to be the climate factor most closely related to health outcomes.26 Because health outcomes are associated with extremes of heat and cold, the temperature effect was modeled as a linear increase above and below threshold values. Threshold values were identified by fitting models over all observed values in the temperature range and then selecting the values with best model fit. For ease of interpretation, the effect of each variable on the daily number of ED patient visits was expressed as the percentage change in daily patient volume. This measure of association is a transformation of the coefficients obtained from GLM and GEE models and represents the increase or decrease in the number of daily ED visits associated with each variable in relation to the reference category. SARIMA models do not provide information on the overall effect of each variable, and thus this method was not used for effect estimation. Forecasting accuracy was assessed through comparison of the forecast and observed values of daily ED visits and through calculation of the mean absolute percentage error (MAPE) in each horizon (7 and 30 days in advance) of the postsample forecasting set. MAPE is the mean of the absolute differences between forecast and observed values, expressed as a percentage of the observed values; thus, a lower MAPE indicates better forecasting accuracy. As a relative measure, MAPE can be used to compare forecasting results of different time-series models and other studies. We observed […] ED patient visits during the training set period. The daily mean number of ED visits was 389, ranging from 166 to 613 visits. During the period, daily mean ambient temperature was […]. The distribution of ED patient volume according to date of ED visit during the study period shows an increase in visits over time, more marked from January […]. The plot also shows the different patterns of weekday and weekend daily ED visits, visible as two separate bands throughout the study period. Analysis of the distribution of data by day of the week and month showed higher patient volumes on Mondays and lower volumes on weekends. There was little variation in daily visits by month.
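Two of the quantities defined above lend themselves to a short worked sketch: the MAPE at the 7-day and 30-day horizons, and the percentage change in daily volume derived from a model coefficient. The function names are hypothetical, and the formula `100 * (exp(beta) - 1)` assumes a log link, which is the usual choice for the Poisson models described in the text but is not spelled out there.

```python
import math


def mape(observed, forecast):
    """Mean absolute percentage error, in percent.

    Assumes every observed value is positive, which holds for daily ED counts.
    """
    if not observed or len(observed) != len(forecast):
        raise ValueError("observed and forecast must be non-empty and equal in length")
    total = sum(abs(obs - fc) / obs for obs, fc in zip(observed, forecast))
    return 100.0 * total / len(observed)


def mape_by_horizon(observed, forecast, horizons=(7, 30)):
    """MAPE over the first h days of a forecasting period, for each h in horizons.

    If a period has fewer than h days, all available days are used.
    """
    return {h: mape(observed[:h], forecast[:h]) for h in horizons}


def percent_change(beta):
    """Percentage change in expected daily visits for a log-link coefficient beta."""
    return 100.0 * (math.exp(beta) - 1.0)


# Made-up numbers for illustration only; they are not study data.
observed = [400, 380, 420, 390, 410, 300, 290] * 5
forecast = [390, 385, 400, 395, 405, 320, 300] * 5
print(mape_by_horizon(observed[:30], forecast[:30]))
print(percent_change(math.log(1.10)))
```

Because MAPE divides by the observed count, the same absolute error weighs more heavily on low-volume days than on busy ones.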
Table 1 shows the estimated effect of each predictor variable on daily ED visits obtained from the GLMs with and without terms for temperature, in terms of percentage change in daily volume. Threshold values for heat and cold effects were […] and […]. Although temperature was associated with daily ED visits, controlling for temperature did not change the estimated effect of the calendar variables. The effects obtained through modeling with GEE were similar and are not shown. Figure 3 shows the observed and forecast values of daily ED visits in the postsample forecasting set for the GLM and SARIMA models. GLM and GEE results were similar, so the GEE was omitted for better visualization. The figure shows that forecast values followed the observed values: the lines representing each of those values show a similar pattern, and there is an overlap of values in the plot. Overall, all tested models could forecast major variations in patient volume, and the times at which peaks and troughs in observed values occurred were in agreement with those in forecast values. Forecasts from models including and not including temperature did not differ. We calculated the MAPE of each model in order to compare their accuracy and determine which would be the best. MAPEs of all tested models were calculated for the three postsample forecasting horizons. MAPEs are also shown for the first 7 days of each horizon. In general, MAPEs for the 7-day horizons were lower than for the 30-day horizons. MAPEs increased from the first to the third horizon. MAPEs from GLM and GEE were similar and showed better results than SARIMA. Controlling for the effect of temperature did not improve ED patient forecasting accuracy. The calculated MAPEs from models including temperature values indicated worse or similar forecasting accuracy. This study assessed different methods for forecasting daily ED visits and compared the accuracy of models with and without inclusion of ambient temperature data. We found that calendar variables were more important forecasting factors than ambient temperature. Our results showed that weekly seasonality was more important than monthly seasonality for daily ED patient variation throughout the study period, and Mondays presented the highest ED patient volumes while weekends presented the lowest. These results are in accordance with previous studies.9, 12, 17 Our models could predict patterns of daily ED visits.
GLM and GEE models performed similarly and showed better forecasting accuracy than SARIMA models. SARIMA models have been widely used for health forecasting;12, 17 however, previous studies found that other time-series methods performed equal to or better than SARIMA.12 One study compared different methods for forecasting hourly ED visits and found that a moving average model had the best forecasting performance. Another compared multiple linear regression, time-series, and other models for accuracy in forecasting daily ED volume at three sites and found that time-series regression had the best accuracy of all tested models. Besides their better accuracy in forecasting, we believe that GLM and GEE are preferable because building a SARIMA model is an iterative process, and the autoregressive and moving average structures may need to be re-specified as updated data are included in the training set, making it harder to use as an automated tool. Seasonality control in previous studies was mostly achieved by the inclusion of indicator variables, that is, time terms for each month of the year. This approach has been criticized as a way to control for seasonal patterns because it implies an abrupt change at the boundary of each month. We controlled for seasonal patterns using terms that model seasonal cycles of different lengths. Despite using this different approach, the overall MAPEs obtained from our models were in accordance with MAPEs in previous studies. Those studies forecast daily ED visits at single hospitals, at groups of hospitals, and at health facilities, in some cases with SARIMA models that included terms for weather, and reported MAPEs of a similar order. In our study, the MAPEs from the GLM and GEE were similar, and models were better for the 7-day than the 30-day horizon. MAPEs at the third horizon (December) were higher than at the second and first (October) horizons, even though the training set for the third horizon included the largest amount of data. A possible explanation for such a finding is that the third horizon may be a less typical period, as December coincides with holidays and many people leave the city. In all 3 years of the data, December had the lowest number of observed ED visits. Our results support not including terms for ambient temperature effects in ED patient forecasting models.
Forecasting models based on calendar variables alone, that is, models that did not include temperature variables for forecasting ED daily visits, performed equal to or better than the more complex models. Besides not improving ED demand forecast accuracy, weather itself cannot be forecast with complete accuracy, especially for horizons of more than 3 days in advance, thus adding further uncertainty to an ED patient forecast model. ED patient forecasting models based on calendar variables alone are more easily set up as an automated process and can be run well in advance, leaving time for planning. Although it might be expected that forecasting models of ED patient visits including terms for temperature effects would perform at least as well as those that did not include such terms, our results did not bear this out. A better fit to the training data does not necessarily make the resulting forecast more accurate, and simpler models are preferable. The aim of this study was to develop models for forecasting ED patient volume that health care managers can use for better planning and resource allocation. To that end, such models should be tested in practice. Although our models could forecast the pattern of daily ED visits, there were some days that were forecast with large absolute errors, which could limit use of the forecasting method in practice. On the other hand, previous studies with similar accuracy to our models reported positive results when used for staffing decisions. For instance, Batal et al.5 reported an 18.5% decrease in patients leaving without being seen and a 30% decrease in complaints after applying their model to staffing. The postsample forecasting set included only the last months of the year (October through December), so we could not assess how the models would perform over the rest of the year. Extending the database to forecast daily ED visits in other periods, to assess any variation in forecast accuracy, could be of great value. Our models included those predictive variables that could be measured in our setting, and there may be other factors influencing daily ED visits that could not be evaluated in this study, such as the availability of other health care services. Environmental factors not considered here may also have the potential to make a contribution to model performance. Our study indicates that time-series models can be developed to provide forecasts of ED patient visits, which might inform future ED staffing decisions. Forecasting ability was dependent on the type of model employed and the length of the time horizon being predicted.
In our setting, generalized linear models and generalized estimating equation models showed better accuracy than seasonal autoregressive integrated moving average models, and including information about ambient temperature did not improve forecasting accuracy. Although there were days with large forecast errors, forecasting models based on calendar variables alone did in general detect patterns of daily variability in ED volume and thus could be used for developing an automated system for better planning of personnel resources.
Marcílio et al. studied this question.