| Title: | Access Brazilian Public Health Data |
|---|---|
| Description: | Provides easy access to Brazilian public health data from multiple sources including VIGITEL (Surveillance of Risk Factors for Chronic Diseases by Telephone Survey), PNS (National Health Survey), 'PNAD' Continua (Continuous National Household Sample Survey), 'POF' (Household Budget Survey with food security and consumption data), 'Censo Demografico' (population denominators via 'SIDRA' API), SIM (Mortality Information System), SINASC (Live Birth Information System), 'SIH' (Hospital Information System), 'SIA' (Outpatient Information System), 'SINAN' (Notifiable Diseases Surveillance), 'CNES' (National Health Facility Registry), 'SI-PNI' (National Immunization Program - aggregated 1994-2019 via FTP, individual-level 'microdata' 2020+ via 'OpenDataSUS' API), 'SISAB' (Primary Care Health Information System - coverage indicators via REST API), ANS ('Agencia Nacional de Saude Suplementar' - supplementary health beneficiaries, consumer complaints, and financial statements), 'ANVISA' ('Agencia Nacional de Vigilancia Sanitaria' - product registrations, 'pharmacovigilance', 'hemovigilance', 'technovigilance', and controlled substance sales via 'SNGPC'), and other health information systems. Data is downloaded from the Brazilian Ministry of Health and 'IBGE' repositories. Data is returned in tidy format following tidyverse conventions. |
| Authors: | Sidney Bissoli [aut, cre] (ORCID: <https://orcid.org/0009-0001-0442-3700>) |
| Maintainer: | Sidney Bissoli <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.2.0 |
| Built: | 2026-05-22 09:22:15 UTC |
| Source: | https://github.com/sidneybissoli/healthbr |
Shows information about cached ANS data files.
ans_cache_status(cache_dir = NULL)
cache_dir |
Character. Cache directory path. Default:
|
A tibble with cache file information (invisibly).
Other ans:
ans_clear_cache(),
ans_data(),
ans_info(),
ans_operators(),
ans_variables(),
ans_years()
ans_cache_status()
Deletes cached ANS data files.
ans_clear_cache(cache_dir = NULL)
cache_dir |
Character. Cache directory path. Default:
|
Invisible NULL.
Other ans:
ans_cache_status(),
ans_data(),
ans_info(),
ans_operators(),
ans_variables(),
ans_years()
ans_clear_cache()
Downloads and returns data from the ANS (Agencia Nacional de Saude Suplementar) open data portal. Supports three data types: beneficiary counts, consumer complaints (NIP), and financial statements.
ans_data(
  year,
  type = "beneficiaries",
  uf = NULL,
  month = NULL,
  quarter = NULL,
  vars = NULL,
  cache = TRUE,
  cache_dir = NULL,
  lazy = FALSE,
  backend = c("arrow", "duckdb")
)
year |
Integer. Year(s) of the data. Required. |
type |
Character. Type of data. One of:
|
uf |
Character. Two-letter state abbreviation(s). Only used for
|
month |
Integer. Month(s) 1-12. Only used for
|
quarter |
Integer. Quarter(s) 1-4. Only used for
|
vars |
Character vector. Variables to keep. If NULL (default),
returns all available variables. Use |
cache |
Logical. If TRUE (default), caches downloaded data for faster future access. |
cache_dir |
Character. Directory for caching. Default:
|
lazy |
Logical. If TRUE, returns a lazy query object instead of a tibble. Requires the arrow package. Default: FALSE. |
backend |
Character. Backend for lazy evaluation: |
Data is downloaded from the ANS open data portal at
https://dadosabertos.ans.gov.br/.
Beneficiaries: Monthly per-state ZIP files containing CSV data with consolidated beneficiary counts by operator, plan type, sex, age group, and municipality. Available from April 2019.
Complaints: Annual national CSV files with consumer complaints filed through the NIP (Notificacao de Intermediacao Preliminar). Available from 2011.
Financial: Quarterly ZIP files with financial statements of health plan operators (balance sheets, income statements). Available from 2007.
When downloading multiple files (e.g., several months or quarters), install
furrr and future and set a parallel plan to speed up downloads:
future::plan(future::multisession, workers = 4). See
vignette("healthbR") for details.
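As a rough sketch of the parallel setup described above, the helper below sets a multisession plan, downloads three months of beneficiary data for one state, and restores the previous plan on exit. The helper name download_ac_q4 is hypothetical, and the sketch assumes that healthbR, furrr, and future are installed and that the ANS portal is reachable. It is not necessarily how the vignette structures the call.

```r
# Hypothetical helper: download Oct-Dec 2023 beneficiary files for Acre.
# Assumes healthbR, furrr and future are installed and the ANS portal is reachable.
download_ac_q4 <- function(workers = 4) {
  # Plan taken from the documentation above; plan() returns the previous
  # plan, which on.exit() restores when the function finishes.
  old_plan <- future::plan(future::multisession, workers = workers)
  on.exit(future::plan(old_plan), add = TRUE)

  healthbR::ans_data(year = 2023, month = 10:12, uf = "AC")
}

# Not run automatically (network access required):
# ac_q4 <- download_ac_q4()
```

Restoring the previous plan keeps the parallel workers from lingering for the rest of the session.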
A tibble with ANS data. Includes partition columns:
year (all types), month and uf_source
(beneficiaries), quarter (financial).
ans_operators() for the operator registry,
ans_variables() for variable descriptions.
Other ans:
ans_cache_status(),
ans_clear_cache(),
ans_info(),
ans_operators(),
ans_variables(),
ans_years()
# beneficiary counts for Acre, December 2023
ac <- ans_data(year = 2023, month = 12, uf = "AC")

# consumer complaints for 2022
nip <- ans_data(year = 2022, type = "complaints")

# financial statements Q1 2023
fin <- ans_data(year = 2023, type = "financial", quarter = 1)
Displays information about the ANS (Agencia Nacional de Saude Suplementar) module, including data sources, available years, and usage guidance.
ans_info()
A list with module information (invisibly).
Other ans:
ans_cache_status(),
ans_clear_cache(),
ans_data(),
ans_operators(),
ans_variables(),
ans_years()
ans_info()
Downloads and returns the current registry of health plan operators from the ANS open data portal. This is a snapshot of the current operator status (not time-series data).
ans_operators(status = "active", vars = NULL, cache = TRUE, cache_dir = NULL)
status |
Character. Filter by operator status:
|
vars |
Character vector. Variables to keep. If NULL (default),
returns all 20 variables. Use |
cache |
Logical. If TRUE (default), caches downloaded data. |
cache_dir |
Character. Directory for caching. |
A tibble with operator data. When status = "all",
includes a status column indicating "active" or "cancelled".
Other ans:
ans_cache_status(),
ans_clear_cache(),
ans_data(),
ans_info(),
ans_variables(),
ans_years()
# active operators
ops <- ans_operators()

# all operators (active + cancelled)
all_ops <- ans_operators(status = "all")
Returns a tibble with available variables in the ANS data, including descriptions and value types.
ans_variables(type = "beneficiaries", search = NULL)
type |
Character. Type of data. One of |
search |
Character. Optional search term to filter variables by name or description. Case-insensitive and accent-insensitive. |
A tibble with columns: variable, description, type, section.
Other ans:
ans_cache_status(),
ans_clear_cache(),
ans_data(),
ans_info(),
ans_operators(),
ans_years()
ans_variables()
ans_variables(type = "complaints")
ans_variables(search = "operadora")
Returns an integer vector with years for which ANS data are available.
ans_years(type = "beneficiaries")
type |
Character. Type of data. One of:
|
An integer vector of available years.
Other ans:
ans_cache_status(),
ans_clear_cache(),
ans_data(),
ans_info(),
ans_operators(),
ans_variables()
ans_years()
ans_years(type = "complaints")
ans_years(type = "financial")
Shows information about cached ANVISA data files.
anvisa_cache_status(cache_dir = NULL)
cache_dir |
Character. Cache directory path. Default:
|
A tibble with cache file information (invisibly).
Other anvisa:
anvisa_clear_cache(),
anvisa_data(),
anvisa_info(),
anvisa_types(),
anvisa_variables()
anvisa_cache_status()
Deletes cached ANVISA data files.
anvisa_clear_cache(cache_dir = NULL)
cache_dir |
Character. Cache directory path. Default:
|
Invisible NULL.
Other anvisa:
anvisa_cache_status(),
anvisa_data(),
anvisa_info(),
anvisa_types(),
anvisa_variables()
anvisa_clear_cache()
Downloads and returns data from the ANVISA (Agencia Nacional de Vigilancia Sanitaria) open data portal. Supports 14 data types across 4 categories: product registrations, reference tables, post-market surveillance, and controlled substance sales (SNGPC).
anvisa_data(
  type = "medicines",
  year = NULL,
  month = NULL,
  vars = NULL,
  cache = TRUE,
  cache_dir = NULL,
  lazy = FALSE,
  backend = c("arrow", "duckdb")
)
type |
Character. Type of data to download. Default: Snapshot types (no year/month needed):
Time-series types (year required):
|
year |
Integer. Year(s) of the data. Only used for SNGPC types (2014-2026). Ignored with a warning for snapshot types. |
month |
Integer. Month(s) 1-12. Only used for SNGPC types. If NULL (default), downloads all 12 months. Ignored with a warning for snapshot types. |
vars |
Character vector. Variables to keep. If NULL (default),
returns all available variables. Use |
cache |
Logical. If TRUE (default), caches downloaded data for faster future access. |
cache_dir |
Character. Directory for caching. Default:
|
lazy |
Logical. If TRUE, returns a lazy query object instead of a tibble. Only available for SNGPC types (partitioned cache). Requires the arrow package. Default: FALSE. |
backend |
Character. Backend for lazy evaluation: |
Data is downloaded from the ANVISA open data portal at
https://dados.anvisa.gov.br/dados/.
Snapshot types: Download a single CSV file representing the current state of the registry/database. No time dimension. Cached as flat files.
SNGPC types: Monthly CSV files with controlled substance sales data. Data available from January 2014 to October 2021, with new data from January 2026. Cached as Hive-style partitioned parquet datasets.
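Hive-style partitioning encodes the partition columns in directory names of the form key=value. The sketch below only builds example paths to show the convention. The cache root, dataset folder, and file name are assumptions made for illustration, not the package's actual layout.

```r
# Illustrative only: "anvisa_cache", "sngpc" and "part-0.parquet" are hypothetical names.
partitions <- expand.grid(month = 1:3, year = 2020)

# One directory level per partition column, written as key=value.
paths <- file.path(
  "anvisa_cache", "sngpc",
  paste0("year=", partitions$year),
  paste0("month=", partitions$month),
  "part-0.parquet"
)
print(paths)
```

Because the partition values live in the path, a reader that understands this layout can skip whole directories when a query filters on year or month.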
The three VigiMed types share the IDENTIFICACAO_NOTIFICACAO key
for linking notifications, medicines, and reactions.
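A minimal sketch of linking on the shared key, using toy data frames in place of the three VigiMed downloads. Only IDENTIFICACAO_NOTIFICACAO comes from the documentation. Every other column name and every value is invented for illustration.

```r
# Toy stand-ins for the three VigiMed tables; every column except
# IDENTIFICACAO_NOTIFICACAO is hypothetical, and so are all values.
notifications <- data.frame(
  IDENTIFICACAO_NOTIFICACAO = c("N1", "N2"),
  notification_year = c(2021, 2022)
)
medicines <- data.frame(
  IDENTIFICACAO_NOTIFICACAO = c("N1", "N1", "N2"),
  medicine = c("drug A", "drug B", "drug C")
)
reactions <- data.frame(
  IDENTIFICACAO_NOTIFICACAO = c("N1", "N2", "N2"),
  reaction = c("rash", "nausea", "headache")
)

# One row per notification-medicine-reaction combination.
linked <- merge(
  merge(notifications, medicines, by = "IDENTIFICACAO_NOTIFICACAO"),
  reactions,
  by = "IDENTIFICACAO_NOTIFICACAO"
)
print(linked)
```

A notification with several medicines and several reactions expands into multiple rows, so counts of notifications should be taken over distinct key values rather than over rows.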
When downloading multiple SNGPC months, install furrr and future
and set a parallel plan to speed up downloads:
future::plan(future::multisession, workers = 4). See
vignette("healthbR") for details.
A tibble with ANVISA data. SNGPC types include year and
month partition columns.
anvisa_types() for available types,
anvisa_variables() for variable descriptions.
Other anvisa:
anvisa_cache_status(),
anvisa_clear_cache(),
anvisa_info(),
anvisa_types(),
anvisa_variables()
# registered medicines
med <- anvisa_data(type = "medicines")

# hemovigilance notifications
hemo <- anvisa_data(type = "hemovigilance")

# SNGPC controlled substance sales, Jan 2020
sngpc <- anvisa_data(type = "sngpc", year = 2020, month = 1)
Displays information about the ANVISA (Agencia Nacional de Vigilancia Sanitaria) module, including data sources, available types, and usage guidance.
anvisa_info()
A list with module information (invisibly).
Other anvisa:
anvisa_cache_status(),
anvisa_clear_cache(),
anvisa_data(),
anvisa_types(),
anvisa_variables()
anvisa_info()
Returns a tibble with available ANVISA data types, their names, descriptions, and categories.
anvisa_types()
A tibble with columns: code, name, description, category.
Other anvisa:
anvisa_cache_status(),
anvisa_clear_cache(),
anvisa_data(),
anvisa_info(),
anvisa_variables()
anvisa_types()
Returns a tibble with available variables for a given ANVISA data type, including descriptions.
anvisa_variables(type = "medicines", search = NULL)
type |
Character. ANVISA data type code. Default: |
search |
Character. Optional search term to filter variables by name or description. Case-insensitive and accent-insensitive. |
A tibble with columns: variable, description.
Other anvisa:
anvisa_cache_status(),
anvisa_clear_cache(),
anvisa_data(),
anvisa_info(),
anvisa_types()
anvisa_variables()
anvisa_variables(type = "hemovigilance")
anvisa_variables(search = "registro")
Retrieves population estimates for intercensal years (2001-2021) from SIDRA table 6579. These estimates provide population denominators for years between censuses.
censo_estimativa(
  year,
  territorial_level = "state",
  geo_code = "all",
  raw = FALSE
)
year |
Numeric or vector. Year(s) between 2001 and 2021. |
territorial_level |
Character. Geographic level:
|
geo_code |
Character. IBGE code(s) for specific localities.
|
raw |
Logical. If TRUE, returns raw API output without cleaning. Default is FALSE. |
Table 6579 provides total population estimates (no sex/age breakdown). These estimates are published annually by IBGE and are widely used as denominators for health indicator calculations.
For census years with full demographic breakdowns, use
censo_populacao instead.
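To illustrate the denominator use mentioned above, the sketch below computes a crude rate per 100,000 inhabitants from toy data. All figures are invented, and the column names uf, population, and deaths are hypothetical. The columns actually returned by censo_estimativa() may differ.

```r
# Toy data: all figures below are invented for illustration.
# In practice `population` would come from censo_estimativa(); its real
# column names may differ from the hypothetical ones used here.
population <- data.frame(
  uf = c("AC", "AP", "RR"),
  population = c(900000, 860000, 630000)
)
deaths <- data.frame(
  uf = c("AC", "AP", "RR"),
  deaths = c(4500, 3440, 3150)
)

# Join numerator and denominator on the state code.
rates <- merge(deaths, population, by = "uf")

# Crude rate per 100,000 inhabitants.
rates$rate_per_100k <- rates$deaths / rates$population * 1e5
print(rates)
```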
A tibble with population estimates.
Data is retrieved from IBGE SIDRA API, table 6579:
https://sidra.ibge.gov.br/tabela/6579
# estimates for 2020 by state
censo_estimativa(year = 2020)

# estimates for multiple years, Brazil level
censo_estimativa(year = 2015:2020, territorial_level = "brazil")

# estimates by municipality
censo_estimativa(year = 2021, territorial_level = "municipality")
Displays information about the Brazilian Census and returns metadata.
censo_info(year = NULL)
year |
Numeric. Year to get specific information about. NULL shows general info. |
Invisibly returns a list with Census metadata.
censo_info()
censo_info(2022)
Retrieves population data from the Brazilian Demographic Census via SIDRA API. Automatically selects the correct SIDRA table based on year and requested variables.
censo_populacao(
  year,
  variables = "total",
  territorial_level = "state",
  geo_code = "all",
  raw = FALSE
)
year |
Numeric. Census year (1970, 1980, 1991, 2000, 2010, or 2022). |
variables |
Character. Type of breakdown:
Default is |
territorial_level |
Character. Geographic level:
|
geo_code |
Character. IBGE code(s) for specific localities.
|
raw |
Logical. If TRUE, returns raw API output without cleaning. Default is FALSE. |
This function provides an easy interface for the most common Census queries. It automatically resolves the correct SIDRA table:
Table 200: Historical population 1970-2010 (by sex, age, situation)
Table 9514: Census 2022 population by sex and age
Table 136: Population by race 2000-2010
Table 9605: Population by race 2022
Table 9515: Population by urban/rural 2022
For more flexibility, use censo_sidra_data to query any table
with custom parameters.
A tibble with population data.
Data is retrieved from IBGE SIDRA API:
https://sidra.ibge.gov.br/
# total population by state, 2022
censo_populacao(year = 2022)

# population by sex, Brazil level
censo_populacao(year = 2022, variables = "sex", territorial_level = "brazil")

# population by age and sex, 2010
censo_populacao(year = 2010, variables = "age_sex")

# population by race, 2022
censo_populacao(year = 2022, variables = "race")
Queries the IBGE SIDRA API to retrieve any Census table. This is the most flexible function, allowing full control over SIDRA query parameters.
censo_sidra_data(
  table,
  territorial_level = "brazil",
  geo_code = "all",
  year = NULL,
  variable = NULL,
  classifications = NULL,
  raw = FALSE
)
table |
Numeric or character. SIDRA table code.
Use |
territorial_level |
Character. Geographic level:
|
geo_code |
Character. IBGE code(s) for specific localities.
|
year |
Numeric or character. Year(s) to query. NULL returns all available periods. |
variable |
Numeric or character. SIDRA variable ID(s). NULL returns all variables excluding metadata. Default NULL. |
classifications |
Named list. SIDRA classification filters.
Example: |
raw |
Logical. If TRUE, returns raw API output without cleaning. Default FALSE. |
A tibble with queried data.
# population by state from 2022 Census
censo_sidra_data(
  table = 9514,
  territorial_level = "state",
  year = 2022,
  variable = 93
)

# population by race, Brazil level
censo_sidra_data(
  table = 9605,
  territorial_level = "brazil",
  year = 2022,
  variable = 93,
  classifications = list("86" = "allxt")
)
Searches Census SIDRA tables by keyword in the table name. Matching is partial, case-insensitive, and accent-insensitive.
censo_sidra_search(keyword, year = NULL)
keyword |
Character. Search term (minimum 2 characters). |
year |
Character or numeric. Filter tables containing data for this year. NULL returns all. |
A tibble with matching tables (same structure as
censo_sidra_tables).
censo_sidra_search("deficiencia")
censo_sidra_search("raca")
censo_sidra_search("indigena")
Returns a catalog of available SIDRA tables for the Census, organized by theme.
censo_sidra_tables(theme = NULL, year = NULL)
theme |
Character. Filter by theme. NULL returns all themes.
Available themes: |
year |
Character or numeric. Filter tables that contain data for this year. NULL returns tables for all years. |
A tibble with columns: table_code, table_name, theme, years, territorial_levels.
# list all Census tables
censo_sidra_tables()

# filter by theme
censo_sidra_tables(theme = "population")

# tables with 2022 data
censo_sidra_tables(year = 2022)
Returns a character vector with available Census years.
censo_years()
A character vector of available years.
censo_years()
Shows information about cached CNES data files.
cnes_cache_status(cache_dir = NULL)
cache_dir |
Character. Cache directory path. Default:
|
A tibble with cache file information (invisibly).
Other cnes:
cnes_clear_cache(),
cnes_data(),
cnes_dictionary(),
cnes_info(),
cnes_variables(),
cnes_years()
cnes_cache_status()
Deletes cached CNES data files.
cnes_clear_cache(cache_dir = NULL)
cache_dir |
Character. Cache directory path. Default:
|
Invisible NULL.
Other cnes:
cnes_cache_status(),
cnes_data(),
cnes_dictionary(),
cnes_info(),
cnes_variables(),
cnes_years()
cnes_clear_cache()
Downloads and returns health facility registry data from DATASUS FTP. Each row represents one health facility record (for the ST type). Data is organized monthly – one .dbc file per type, state (UF), and month.
cnes_data(
  year,
  type = "ST",
  month = NULL,
  vars = NULL,
  uf = NULL,
  parse = TRUE,
  col_types = NULL,
  cache = TRUE,
  cache_dir = NULL,
  lazy = FALSE,
  backend = c("arrow", "duckdb")
)
year |
Integer. Year(s) of the data. Required. |
type |
Character. File type to download. Default: |
month |
Integer. Month(s) of the data (1-12). If NULL (default),
downloads all 12 months. Example: |
vars |
Character vector. Variables to keep. If NULL (default),
returns all available variables. Use |
uf |
Character. Two-letter state abbreviation(s) to download.
If NULL (default), downloads all 27 states.
Example: |
parse |
Logical. If TRUE (default), converts columns to
appropriate types (integer, double, Date) based on the variable
metadata. Use |
col_types |
Named list. Override the default type for specific
columns. Names are column names, values are type strings:
|
cache |
Logical. If TRUE (default), caches downloaded data for faster future access. |
cache_dir |
Character. Directory for caching. Default:
|
lazy |
Logical. If TRUE, returns a lazy query object instead of a
tibble. Requires the arrow package. The lazy object supports
dplyr verbs (filter, select, mutate, etc.) which are pushed down
to the query engine before collecting into memory. Call
|
backend |
Character. Backend for lazy evaluation: |
Data is downloaded from DATASUS FTP as .dbc files (one per type/state/month). The .dbc format is decompressed internally using vendored C code from the blast library. No external dependencies are required.
CNES data is monthly, so downloading an entire year for all states requires
324 files (27 UFs x 12 months) per type. Use uf and month
to limit downloads.
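The file count above can be checked with a short calculation, which also shows how much a narrower request saves. The list of state abbreviations is standard; the variable names are purely illustrative.

```r
# 27 federative units (26 states plus the Federal District).
ufs <- c("AC", "AL", "AP", "AM", "BA", "CE", "DF", "ES", "GO", "MA", "MT", "MS", "MG",
         "PA", "PB", "PR", "PE", "PI", "RJ", "RN", "RS", "RO", "RR", "SC", "SP", "SE", "TO")

# One file per state and month for a single type and year.
full_year <- expand.grid(uf = ufs, month = 1:12)
nrow(full_year)        # 324

# Restricting the request to one state and one quarter.
subset_request <- expand.grid(uf = "AC", month = 1:3)
nrow(subset_request)   # 3
```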
The CNES has 13 file types. The default "ST" (establishments) is
the most commonly used. Use cnes_info() to see all types.
When downloading multiple files (e.g., several months or states), install
furrr and future and set a parallel plan to speed up downloads:
future::plan(future::multisession, workers = 4). See
vignette("healthbR") for details.
A tibble with health facility data. Includes columns
year, month, and uf_source to identify the source
when multiple years/months/states are combined.
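With lazy = TRUE the same call returns a query object rather than a tibble. The sketch below counts establishments by unit type and only then brings the result into memory. It assumes that healthbR, arrow, and dplyr are installed, that the DATASUS FTP is reachable, and that dplyr::collect() is the call that materialises the query, as is usual for arrow and duckdb backends. The helper name count_unit_types is hypothetical.

```r
# Hypothetical helper; assumes healthbR, arrow and dplyr are installed and that
# the DATASUS FTP is reachable. collect() is assumed to be the materialising call.
count_unit_types <- function() {
  lazy_tbl <- healthbR::cnes_data(year = 2023, month = 1, uf = "AC", lazy = TRUE)

  # The verbs below are recorded on the lazy object and evaluated at collect().
  lazy_tbl |>
    dplyr::select(CNES, TP_UNID) |>
    dplyr::count(TP_UNID) |>
    dplyr::collect()
}

# Not run automatically (network access required):
# unit_types <- count_unit_types()
```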
cnes_info() for file type descriptions,
censo_populacao() for population denominators.
Other cnes:
cnes_cache_status(),
cnes_clear_cache(),
cnes_dictionary(),
cnes_info(),
cnes_variables(),
cnes_years()
# all establishments in Acre, January 2023
ac_jan <- cnes_data(year = 2023, month = 1, uf = "AC")

# only key variables
cnes_data(year = 2023, month = 1, uf = "AC",
          vars = c("CNES", "CODUFMUN", "TP_UNID", "VINC_SUS"))

# hospital beds
leitos <- cnes_data(year = 2023, month = 1, uf = "AC", type = "LT")

# health professionals
prof <- cnes_data(year = 2023, month = 1, uf = "AC", type = "PF")
Returns a tibble with the complete data dictionary for the CNES, including variable descriptions and category labels.
cnes_dictionary(variable = NULL)
variable |
Character. If provided, returns dictionary for a specific variable only. Default: NULL (returns all variables). |
A tibble with columns: variable, description, code, label.
Other cnes:
cnes_cache_status(),
cnes_clear_cache(),
cnes_data(),
cnes_info(),
cnes_variables(),
cnes_years()
cnes_dictionary()
cnes_dictionary("TP_UNID")
cnes_dictionary("ESFERA_A")
Displays information about the National Health Facility Registry (CNES), including data sources, available years, file types, and usage guidance.
cnes_info()
A list with module information (invisibly).
Other cnes:
cnes_cache_status(),
cnes_clear_cache(),
cnes_data(),
cnes_dictionary(),
cnes_variables(),
cnes_years()
cnes_info()
Returns a tibble with available variables in the CNES data (ST type), including descriptions and value types.
cnes_variables(type = "ST", search = NULL)
type |
Character. File type to show variables for. Currently only
|
search |
Character. Optional search term to filter variables by name or description. Case-insensitive and accent-insensitive. |
A tibble with columns: variable, description, type, section.
Other cnes:
cnes_cache_status(),
cnes_clear_cache(),
cnes_data(),
cnes_dictionary(),
cnes_info(),
cnes_years()
cnes_variables()
cnes_variables(search = "tipo")
cnes_variables(search = "gestao")
Returns an integer vector with years for which health facility registry data are available from DATASUS FTP.
cnes_years(status = "final")
status |
Character. Filter by data status. One of:
|
An integer vector of available years.
Other cnes:
cnes_cache_status(),
cnes_clear_cache(),
cnes_data(),
cnes_dictionary(),
cnes_info(),
cnes_variables()
cnes_years()
cnes_years(status = "all")
Returns information about all data sources available in healthbR.
list_sources()
A tibble with columns:
source: Source code (e.g., "vigitel", "sim")
name: Full name of the data source
description: Brief description
years: Range of available years
status: Implementation status ("available", "planned")
list_sources()
Shows cache status including downloaded files and their sizes.
pnadc_cache_status(cache_dir = NULL)
cache_dir |
Character. Optional custom cache directory. If NULL (default), uses the standard user cache directory. |
A tibble with cache information
pnadc_cache_status()
Removes all cached PNADC data files.
pnadc_clear_cache(module = NULL, cache_dir = NULL)
module |
Character. Optional module to clear cache for. If NULL (default), clears cache for all modules. |
cache_dir |
Character. Optional custom cache directory. If NULL (default), uses the standard user cache directory. |
NULL (invisibly)
pnadc_clear_cache()
Downloads and returns PNADC microdata for the specified module and year(s)
from the IBGE FTP. Data is cached locally to avoid repeated downloads.
When the arrow package is installed, data is cached in parquet format
for faster subsequent reads.
pnadc_data(
  module,
  year = NULL,
  vars = NULL,
  as_survey = FALSE,
  cache_dir = NULL,
  refresh = FALSE,
  lazy = FALSE,
  backend = c("arrow", "duckdb")
)
module |
Character. The module identifier. Use |
year |
Numeric or vector. Year(s) to download. Use NULL for all available years for the module. Default is NULL. |
vars |
Character vector. Variables to select. Use NULL for all variables. Survey design variables (UPA, Estrato, V1028) and key demographic variables are always included. Default is NULL. |
as_survey |
Logical. If TRUE, returns a survey design object (requires
the |
cache_dir |
Character. Directory for caching downloaded files.
Default uses |
refresh |
Logical. If TRUE, re-download even if file exists in cache. Default is FALSE. |
lazy |
Logical. If TRUE, returns a lazy query object instead of a
tibble. Requires the arrow package. The lazy object supports
dplyr verbs (filter, select, mutate, etc.) which are pushed down
to the query engine before collecting into memory. Call
|
backend |
Character. Backend for lazy evaluation: |
PNAD Continua (Pesquisa Nacional por Amostra de Domicilios Continua) is a quarterly household survey conducted by IBGE. This function provides access to supplementary modules with health-related content.
deficiencia: Persons with disabilities (2019, 2022, 2024)
habitacao: Housing characteristics (2012-2019, 2022-2024)
moradores: General characteristics of residents (2012-2019, 2022-2024)
aps: Primary health care (2022)
For proper statistical analysis with complex survey design, the following variables are always included:
UPA: Primary sampling unit
Estrato: Stratum
V1028: Survey weight
Use as_survey = TRUE to get a properly weighted survey design object
for analysis with the srvyr package.
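The sketch below shows how the three design variables fit together in a srvyr design, using toy records instead of downloaded microdata. The variable has_disability and all values are invented, and nest = TRUE is an assumption rather than a documented healthbR setting. With as_survey = TRUE, pnadc_data() is described as returning a design object directly, so building one by hand should not normally be needed.

```r
library(srvyr)

# Toy records: the design variable names come from the documentation above,
# while `has_disability` and all values are invented for illustration.
toy <- data.frame(
  UPA = c(1, 1, 2, 2, 3, 3, 4, 4),
  Estrato = c(1, 1, 1, 1, 2, 2, 2, 2),
  V1028 = c(100, 120, 80, 90, 150, 140, 110, 130),
  has_disability = c(1, 0, 0, 0, 1, 1, 0, 0)
)

# nest = TRUE is an assumption, not a documented healthbR setting.
design <- as_survey_design(toy, ids = UPA, strata = Estrato,
                           weights = V1028, nest = TRUE)

# Weighted proportion with its standard error.
estimate <- summarise(design, prop = survey_mean(has_disability))
print(estimate)
```

The point estimate equals the weighted mean of has_disability, while the standard error reflects the clustering and stratification declared in the design.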
When downloading multiple years, install furrr and future and
set a parallel plan to speed up downloads:
future::plan(future::multisession, workers = 4). See
vignette("healthbR") for details.
A tibble with PNADC microdata, or a srvyr survey design object
if as_survey = TRUE.
Data is downloaded from the IBGE FTP server:
https://ftp.ibge.gov.br/Trabalho_e_Rendimento/Pesquisa_Nacional_por_Amostra_de_Domicilios_continua/
# download deficiencia module for 2022
df <- pnadc_data(module = "deficiencia", year = 2022, cache_dir = tempdir())

# download with survey design
svy <- pnadc_data(
  module = "deficiencia",
  year = 2022,
  as_survey = TRUE,
  cache_dir = tempdir()
)

# select specific variables
df_subset <- pnadc_data(
  module = "deficiencia",
  year = 2022,
  vars = c("S11001", "S11002"),
  cache_dir = tempdir()
)
Downloads and returns the variable dictionary for PNADC microdata. The dictionary is cached locally to avoid repeated downloads.
pnadc_dictionaries(module, year = NULL, cache_dir = NULL, refresh = FALSE)
module |
Character. The module identifier (e.g., "deficiencia", "habitacao"). |
year |
Numeric. Year to get dictionary for. Uses most recent year if NULL. |
cache_dir |
Character. Directory for caching downloaded files.
Default uses |
refresh |
Logical. If TRUE, re-download even if file exists in cache. Default is FALSE. |
The dictionary includes variable names, positions, and widths from the
IBGE input specification file. This is useful for understanding the structure
of the data returned by pnadc_data.
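To show why positions and widths matter, the sketch below cuts two invented fixed-width records using a toy dictionary. The dictionary column names variable, start, and width are hypothetical, and the real output of pnadc_dictionaries() may be organised differently.

```r
# Hypothetical dictionary: the real column names returned by
# pnadc_dictionaries() may differ from `variable`, `start` and `width`.
dict <- data.frame(
  variable = c("Ano", "UF", "V1028"),
  start = c(1, 5, 7),
  width = c(4, 2, 6)
)

# Two invented fixed-width records.
records <- c("202212000150", "202235000275")

# Cut each record at the positions given by the dictionary.
fields <- lapply(seq_len(nrow(dict)), function(i) {
  substr(records, dict$start[i], dict$start[i] + dict$width[i] - 1)
})
parsed <- as.data.frame(setNames(fields, dict$variable), stringsAsFactors = FALSE)
print(parsed)
```

All fields come back as character strings, so type conversion would be a separate step.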
A tibble with variable definitions.
Dictionaries are downloaded from the IBGE FTP server.
# get dictionary for deficiencia module
dict <- pnadc_dictionaries(module = "deficiencia", cache_dir = tempdir())
Displays information about PNAD Continua and returns metadata.
pnadc_info()
Invisibly returns a list with survey metadata.
pnadc_info()
Returns information about the available supplementary modules in PNAD Continua that are supported by this package.
pnadc_modules()
A tibble with module information including name, available years, and descriptions.
pnadc_modules()
Returns the names of the variables available in the PNADC microdata for a given module.
This is a convenience wrapper around pnadc_dictionaries.
pnadc_variables(module, year = NULL, cache_dir = NULL, refresh = FALSE)
module |
Character. The module identifier (e.g., "deficiencia", "habitacao"). |
year |
Numeric. Year to get variables for. Uses most recent year if NULL. |
cache_dir |
Character. Directory for caching downloaded files.
Default uses |
refresh |
Logical. If TRUE, re-download even if file exists in cache. Default is FALSE. |
A character vector of variable names.
# list variables for deficiencia module
pnadc_variables(module = "deficiencia", cache_dir = tempdir())
Returns a vector of years for which data is available for the specified module.
pnadc_years(module)
module |
Character. The module identifier. Use |
An integer vector of available years.
pnadc_years("deficiencia")
pnadc_years("habitacao")
Shows cache status including downloaded files and their sizes.
pns_cache_status(cache_dir = NULL)
cache_dir |
Character. Optional custom cache directory. If NULL (default), uses the standard user cache directory. |
A tibble with cache information.
pns_cache_status()
Removes all cached PNS data files.
pns_clear_cache(cache_dir = NULL)
cache_dir |
Character. Optional custom cache directory. If NULL (default), uses the standard user cache directory. |
NULL (invisibly)
pns_clear_cache()
Downloads and returns PNS microdata for specified years from the IBGE FTP.
Data is cached locally to avoid repeated downloads. When the arrow package
is installed, data is cached in parquet format for faster subsequent reads.
pns_data(
  year = NULL,
  vars = NULL,
  cache_dir = NULL,
  refresh = FALSE,
  lazy = FALSE,
  backend = c("arrow", "duckdb")
)
year |
Numeric or vector. Year(s) to download (2013, 2019). Use NULL to download all available years. Default is NULL. |
vars |
Character vector. Variables to select. Use NULL for all variables. Default is NULL. |
cache_dir |
Character. Directory for caching downloaded files.
Default uses |
refresh |
Logical. If TRUE, re-download even if file exists in cache. Default is FALSE. |
lazy |
Logical. If TRUE, returns a lazy query object instead of a
tibble. Requires the arrow package. The lazy object supports
dplyr verbs (filter, select, mutate, etc.) which are pushed down
to the query engine before collecting into memory. Call
|
backend |
Character. Backend for lazy evaluation: |
The PNS (Pesquisa Nacional de Saude) is a household survey conducted by IBGE in partnership with the Ministry of Health. It provides comprehensive data on health conditions, lifestyle, and healthcare access of the Brazilian population.
For proper statistical analysis that accounts for the complex survey design, use the following
weight and design variables with the srvyr or survey packages:
V0028: household weight
V0029: selected person weight
V0030: person weight with non-response adjustment
UPA_PNS: primary sampling unit
V0024: stratum
When downloading multiple years, install furrr and future and
set a parallel plan to speed up downloads:
future::plan(future::multisession, workers = 4). See
vignette("healthbR") for details.
A tibble with PNS microdata.
Data is downloaded from the IBGE FTP server:
https://ftp.ibge.gov.br/PNS/
# download PNS 2019 data
df <- pns_data(year = 2019, cache_dir = tempdir())

# download all years
df_all <- pns_data(cache_dir = tempdir())

# select specific variables
df_subset <- pns_data(
  year = 2019,
  vars = c("V0001", "C006", "C008", "V0028"),
  cache_dir = tempdir()
)
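To see why the weight variables listed above matter, compare an unweighted and a weighted share on a toy data set. This base R sketch uses invented values; `smoker` is a hypothetical 0/1 indicator rather than a PNS variable name, and V0029 stands in for the selected-person weight.

```r
# Toy data: all values are invented for illustration only.
toy <- data.frame(
  V0024  = c("s1", "s1", "s2", "s2"),  # stratum
  V0029  = c(100, 300, 50, 150),       # selected-person weight
  smoker = c(1, 0, 1, 0),              # hypothetical indicator
  stringsAsFactors = FALSE
)

# Unweighted share: every respondent counts once.
unweighted <- mean(toy$smoker)

# Weighted share: each respondent counts for V0029 people.
weighted <- weighted.mean(toy$smoker, w = toy$V0029)

print(c(unweighted = unweighted, weighted = weighted))
```

This covers point estimates only. Standard errors and confidence intervals also depend on the stratum and primary sampling unit, which is what the design objects in the survey and srvyr packages handle.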
Downloads and returns the variable dictionary for PNS microdata. The dictionary is cached locally to avoid repeated downloads.
pns_dictionary(year = 2019, cache_dir = NULL, refresh = FALSE)
year |
Numeric. Year to get dictionary for (2013 or 2019). Default is 2019. |
cache_dir |
Character. Directory for caching downloaded files.
Default uses |
refresh |
Logical. If TRUE, re-download even if file exists in cache. Default is FALSE. |
The dictionary includes variable names, labels, and response categories
for the PNS microdata. This is useful for understanding the structure
of the data returned by pns_data.
A tibble with variable definitions.
Dictionaries are downloaded from the IBGE FTP server:
https://ftp.ibge.gov.br/PNS/
# get dictionary for 2019
dict <- pns_dictionary(year = 2019, cache_dir = tempdir())

# get dictionary for 2013
dict_2013 <- pns_dictionary(year = 2013, cache_dir = tempdir())
Displays information about the PNS survey and returns metadata.
pns_info(year = NULL)
year |
Numeric. Year to get specific information about. NULL shows general info. |
Invisibly returns a list with survey metadata.
pns_info()
pns_info(2019)
Returns information about the questionnaire modules available in the PNS.
pns_modules(year = NULL)
year |
Numeric. Year to get modules for (2013 or 2019). NULL returns modules for all years. Default is NULL. |
A tibble with module codes, names, and descriptions.
pns_modules()
pns_modules(year = 2019)
Queries the IBGE SIDRA API to retrieve tabulated PNS indicators. Returns pre-aggregated data (prevalences, means, proportions) with confidence intervals and coefficients of variation.
pns_sidra_data(
  table,
  territorial_level = "brazil",
  geo_code = "all",
  year = NULL,
  variable = NULL,
  classifications = NULL,
  raw = FALSE
)
table |
Numeric or character. SIDRA table code.
Use |
territorial_level |
Character. Geographic level: "brazil" (N1), "region" (N2), "state" (N3), "municipality" (N6). Default "brazil". |
geo_code |
Character. IBGE code(s) for specific localities. "all" returns all localities at the chosen level. Default "all". |
year |
Numeric. Year(s) to query. NULL returns all available periods. |
variable |
Numeric or character. SIDRA variable ID(s). NULL returns all variables excluding metadata. Default NULL. |
classifications |
Named list. SIDRA classification filters. Example: list("2" = "6794") for sex = total. NULL returns default aggregation. Default NULL. |
raw |
Logical. If TRUE, returns raw API output without cleaning. Default FALSE. |
A tibble with queried indicators.
# self-rated health by state, 2019
pns_sidra_data(
  table = 4751,
  territorial_level = "state",
  year = 2019
)

# same table, Brazil-level, both years
pns_sidra_data(
  table = 4751,
  territorial_level = "brazil",
  year = c(2013, 2019)
)

# hypertension data
pns_sidra_data(
  table = 4416,
  territorial_level = "brazil"
)
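The territorial_level argument maps friendly names to SIDRA level codes (N1, N2, N3, N6). A small lookup shows that mapping in isolation. `sidra_level_code()` is a hypothetical helper written for this sketch, not a function exported by the package, and the package may validate its input differently.

```r
# Lookup built from the levels listed in the argument description.
sidra_levels <- c(
  brazil       = "N1",
  region       = "N2",
  state        = "N3",
  municipality = "N6"
)

# Hypothetical helper: validate the level name, then return its code.
sidra_level_code <- function(territorial_level) {
  level <- match.arg(territorial_level, names(sidra_levels))
  unname(sidra_levels[level])
}

print(sidra_level_code("state"))
```

Because `match.arg()` is used, an unsupported level such as "city" raises an error instead of silently producing an invalid query.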
Searches PNS SIDRA tables by keyword in the table name or description. Matching is partial, case-insensitive, and accent-insensitive.
pns_sidra_search(keyword, year = NULL)
keyword |
Character. Search term (minimum 2 characters). |
year |
Numeric. Filter tables containing data for this year. NULL returns all. |
A tibble with matching tables (same structure as pns_sidra_tables()).
pns_sidra_search("diabetes")
pns_sidra_search("hipertensao")
pns_sidra_search("fumante")
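Accent-insensitive matching is what lets "hipertensao" find a table whose name contains "Hipertensão". One way to get that behaviour in base R is sketched below. `normalize_text()` and `search_names()` are hypothetical helpers, the table names are made up, and the package's own search may be implemented differently.

```r
# Hypothetical helper: replace common Portuguese accented letters with
# their plain equivalents, then lowercase the result.
normalize_text <- function(x) {
  accented <- paste0(
    "\u00e1\u00e0\u00e2\u00e3\u00e9\u00ea\u00ed\u00f3\u00f4\u00f5\u00fa\u00fc\u00e7",
    "\u00c1\u00c0\u00c2\u00c3\u00c9\u00ca\u00cd\u00d3\u00d4\u00d5\u00da\u00dc\u00c7"
  )
  plain <- "aaaaeeiooouucAAAAEEIOOOUUC"
  tolower(chartr(accented, plain, x))
}

# Made-up table names; the real catalog comes from pns_sidra_tables().
table_names <- c(
  "Hipertens\u00e3o arterial",
  "Diabetes",
  "Consumo de \u00e1lcool"
)

# Hypothetical helper: partial match on the normalized text.
search_names <- function(keyword, names) {
  hits <- grepl(normalize_text(keyword), normalize_text(names), fixed = TRUE)
  names[hits]
}

print(search_names("HIPERTENSAO", table_names))
```

Normalizing both the keyword and the candidate names means the user can type the term with or without accents and in any letter case.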
Returns a catalog of available SIDRA tables for the PNS, organized by health theme.
pns_sidra_tables(theme = NULL, year = NULL)
theme |
Character. Filter by theme. NULL returns all themes. Available themes: "chronic_diseases", "lifestyle", "health_services", "health_perception", "womens_health", "accidents_violence", "oral_health", "anthropometry", "health_insurance", "disability", "elderly", "tobacco", "alcohol", "physical_activity", "nutrition", "medications", "mental_health", "work_health", "child_health". |
year |
Numeric. Filter tables that contain data for this year. NULL returns tables for all years. |
A tibble with columns: table_code, table_name, theme, theme_label, years, territorial_levels.
# list all tables
pns_sidra_tables()

# filter by theme
pns_sidra_tables(theme = "chronic_diseases")

# tables with 2013 data
pns_sidra_tables(year = 2013)
Returns a list of available variables in the PNS microdata with their labels.
This is a convenience wrapper around pns_dictionary that
returns only unique variable names and labels.
pns_variables(year = 2019, module = NULL, cache_dir = NULL, refresh = FALSE)
year |
Numeric. Year to get variables for (2013 or 2019). Default is 2019. |
module |
Character. Filter by module code (e.g., "J", "K", "L"). NULL returns all modules. Default is NULL. |
cache_dir |
Character. Directory for caching downloaded files.
Default uses |
refresh |
Logical. If TRUE, re-download even if file exists in cache. Default is FALSE. |
A tibble with variable names and labels.
# list all variables for 2019
pns_variables(year = 2019, cache_dir = tempdir())

# list variables for a specific module
pns_variables(year = 2019, module = "J", cache_dir = tempdir())
Returns a character vector with available PNS survey years.
pns_years()
A character vector of available years.
pns_years()
Shows cache status including downloaded files and their sizes.
pof_cache_status(cache_dir = NULL)
cache_dir |
Character. Optional custom cache directory. If NULL (default), uses the standard user cache directory. |
A tibble with cache information.
Other pof:
pof_clear_cache(),
pof_data(),
pof_dictionary(),
pof_info(),
pof_registers(),
pof_variables(),
pof_years()
pof_cache_status()
Removes all cached POF data files.
pof_clear_cache(cache_dir = NULL)
cache_dir |
Character. Optional custom cache directory. If NULL (default), uses the standard user cache directory. |
NULL (invisibly)
Other pof:
pof_cache_status(),
pof_data(),
pof_dictionary(),
pof_info(),
pof_registers(),
pof_variables(),
pof_years()
pof_clear_cache()
Downloads POF microdata from IBGE FTP and returns as a tibble. Data is cached locally to avoid repeated downloads.
pof_data(
  year = "2017-2018",
  register = "morador",
  vars = NULL,
  cache_dir = NULL,
  as_survey = FALSE,
  refresh = FALSE,
  lazy = FALSE,
  backend = c("arrow", "duckdb")
)
year |
Character. POF edition (e.g., "2017-2018"). Default is "2017-2018". |
register |
Character. Which register to download.
Use |
vars |
Character vector. Optional: specific variables to select. If NULL, returns all variables from the register. Default is NULL. |
cache_dir |
Character. Directory for caching downloaded files.
Default uses |
as_survey |
Logical. If TRUE, returns survey design object. Requires srvyr package. Default is FALSE. |
refresh |
Logical. If TRUE, re-download even if file exists in cache. Default is FALSE. |
lazy |
Logical. If TRUE, returns a lazy query object instead of a
tibble. Requires the arrow package. The lazy object supports
dplyr verbs (filter, select, mutate, etc.) which are pushed down
to the query engine before collecting into memory. Call
|
backend |
Character. Backend for lazy evaluation: |
The POF (Pesquisa de Orcamentos Familiares) is a household survey conducted by IBGE that investigates household budgets, living conditions, and nutritional profiles of the Brazilian population.
The POF contains several health-related modules:
EBIA (Food Security Scale): Available in 2017-2018, variable V6199 in the domicilio register
Food Consumption: Detailed food consumption data in the consumo_alimentar register (2008-2009, 2017-2018)
Health Expenses: Spending on medications, health insurance, and consultations in the despesa_individual register
Anthropometry: Weight, height, BMI in morador register (2008-2009 only)
For proper statistical analysis with complex survey design, use
as_survey = TRUE which creates a survey design object with:
Weight variable: PESO_FINAL
Stratum variable: ESTRATO_POF
PSU variable: COD_UPA
A tibble with microdata, or tbl_svy if as_survey = TRUE.
Data is downloaded from the IBGE FTP server:
https://ftp.ibge.gov.br/Orcamentos_Familiares/
pof_years, pof_info,
pof_registers, pof_variables
Other pof:
pof_cache_status(),
pof_clear_cache(),
pof_dictionary(),
pof_info(),
pof_registers(),
pof_variables(),
pof_years()
# basic usage - download morador register
morador <- pof_data("2017-2018", "morador", cache_dir = tempdir())

# download domicilio register (includes EBIA)
domicilio <- pof_data("2017-2018", "domicilio", cache_dir = tempdir())

# select specific variables
df <- pof_data(
  "2017-2018", "morador",
  vars = c("COD_UPA", "ESTRATO_POF", "PESO_FINAL", "V0403"),
  cache_dir = tempdir()
)

# with survey design (requires srvyr package)
morador_svy <- pof_data("2017-2018", "morador", as_survey = TRUE, cache_dir = tempdir())
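Before relying on a design object, it can help to look at how the three design variables relate to each other. The sketch below uses invented values to count primary sampling units per stratum and to sum the weights per stratum. It only illustrates the roles of PESO_FINAL, ESTRATO_POF, and COD_UPA; it does not reproduce what as_survey = TRUE does internally.

```r
# Toy design data: all values are invented for illustration.
toy <- data.frame(
  ESTRATO_POF = c(1, 1, 1, 2, 2, 2),
  COD_UPA     = c(10, 10, 11, 20, 21, 21),
  PESO_FINAL  = c(120, 80, 200, 60, 90, 150)
)

# Number of distinct PSUs in each stratum.
psu_per_stratum <- tapply(
  toy$COD_UPA, toy$ESTRATO_POF,
  function(x) length(unique(x))
)

# Population represented by each stratum (sum of the weights).
pop_per_stratum <- tapply(toy$PESO_FINAL, toy$ESTRATO_POF, sum)

print(psu_per_stratum)
print(pop_per_stratum)
```

Strata with a single PSU usually need special handling in variance estimation, so a quick count like this is a reasonable sanity check after subsetting the data.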
Downloads and returns the variable dictionary for POF microdata. The dictionary is cached locally to avoid repeated downloads.
pof_dictionary(
  year = "2017-2018",
  register = NULL,
  cache_dir = NULL,
  refresh = FALSE
)
year |
Character. POF edition (e.g., "2017-2018"). Default is "2017-2018". |
register |
Character. Register name. If NULL, returns all registers. Default is NULL. |
cache_dir |
Character. Directory for caching downloaded files.
Default uses |
refresh |
Logical. If TRUE, re-download even if file exists in cache. Default is FALSE. |
A tibble with variable definitions including: variable, description, position, length, decimals, register.
Other pof:
pof_cache_status(),
pof_clear_cache(),
pof_data(),
pof_info(),
pof_registers(),
pof_variables(),
pof_years()
pof_dictionary("2017-2018", "morador", cache_dir = tempdir())
Returns metadata about the POF survey edition including available health modules and sampling design information.
pof_info(year = "2017-2018")
year |
Character. POF edition (e.g., "2017-2018"). Default is "2017-2018". |
A list with survey metadata (invisibly).
Other pof:
pof_cache_status(),
pof_clear_cache(),
pof_data(),
pof_dictionary(),
pof_registers(),
pof_variables(),
pof_years()
pof_info()
pof_info("2017-2018")
pof_info("2008-2009")
Returns information about the data registers available in the POF.
pof_registers(year = "2017-2018", health_only = FALSE)
year |
Character. POF edition (e.g., "2017-2018"). Default is "2017-2018". |
health_only |
Logical. If TRUE, returns only health-related registers. Default is FALSE. |
A tibble with register names and descriptions.
Other pof:
pof_cache_status(),
pof_clear_cache(),
pof_data(),
pof_dictionary(),
pof_info(),
pof_variables(),
pof_years()
pof_registers()
pof_registers("2017-2018", health_only = TRUE)
Returns a list of available variables in the POF microdata with their labels.
This is a convenience wrapper around pof_dictionary that
returns a simplified view.
pof_variables(
  year = "2017-2018",
  register = NULL,
  search = NULL,
  cache_dir = NULL,
  refresh = FALSE
)
year |
Character. POF edition (e.g., "2017-2018"). Default is "2017-2018". |
register |
Character. Register name (e.g., "morador", "domicilio"). If NULL, returns variables from all registers. Default is NULL. |
search |
Character. Optional search term to filter variables by name or description. Default is NULL. |
cache_dir |
Character. Directory for caching downloaded files.
Default uses |
refresh |
Logical. If TRUE, re-download even if file exists in cache. Default is FALSE. |
A tibble with columns: variable, description, position, length, register.
Other pof:
pof_cache_status(),
pof_clear_cache(),
pof_data(),
pof_dictionary(),
pof_info(),
pof_registers(),
pof_years()
pof_variables("2017-2018", "morador", cache_dir = tempdir())
pof_variables("2017-2018", "domicilio", search = "ebia", cache_dir = tempdir())
Returns a character vector with available POF survey years.
pof_years()
A character vector of available years in "YYYY-YYYY" format.
Other pof:
pof_cache_status(),
pof_clear_cache(),
pof_data(),
pof_dictionary(),
pof_info(),
pof_registers(),
pof_variables()
pof_years()
Shows information about cached SIA data files.
sia_cache_status(cache_dir = NULL)
cache_dir |
Character. Cache directory path. Default:
|
A tibble with cache file information (invisibly).
Other sia:
sia_clear_cache(),
sia_data(),
sia_dictionary(),
sia_info(),
sia_variables(),
sia_years()
sia_cache_status()
Deletes cached SIA data files.
sia_clear_cache(cache_dir = NULL)
cache_dir |
Character. Cache directory path. Default:
|
Invisible NULL.
Other sia:
sia_cache_status(),
sia_data(),
sia_dictionary(),
sia_info(),
sia_variables(),
sia_years()
sia_clear_cache()
Downloads and returns outpatient production microdata from DATASUS FTP. Each row represents one outpatient production record. Data is organized monthly – one .dbc file per type, state (UF), and month.
sia_data(
  year,
  type = "PA",
  month = NULL,
  vars = NULL,
  uf = NULL,
  procedure = NULL,
  diagnosis = NULL,
  parse = TRUE,
  col_types = NULL,
  cache = TRUE,
  cache_dir = NULL,
  lazy = FALSE,
  backend = c("arrow", "duckdb")
)
year |
Integer. Year(s) of the data. Required. |
type |
Character. File type to download. Default: |
month |
Integer. Month(s) of the data (1-12). If NULL (default),
downloads all 12 months. Example: |
vars |
Character vector. Variables to keep. If NULL (default),
returns all available variables. Use |
uf |
Character. Two-letter state abbreviation(s) to download.
If NULL (default), downloads all 27 states.
Example: |
procedure |
Character. SIGTAP procedure code pattern(s) to filter by
( |
diagnosis |
Character. CID-10 code pattern(s) to filter by principal
diagnosis ( |
parse |
Logical. If TRUE (default), converts columns to
appropriate types (integer, double, Date) based on the variable
metadata. Use |
col_types |
Named list. Override the default type for specific
columns. Names are column names, values are type strings:
|
cache |
Logical. If TRUE (default), caches downloaded data for faster future access. |
cache_dir |
Character. Directory for caching. Default:
|
lazy |
Logical. If TRUE, returns a lazy query object instead of a
tibble. Requires the arrow package. The lazy object supports
dplyr verbs (filter, select, mutate, etc.) which are pushed down
to the query engine before collecting into memory. Call
|
backend |
Character. Backend for lazy evaluation: |
Data is downloaded from DATASUS FTP as .dbc files (one per type/state/month). The .dbc format is decompressed internally using vendored C code from the blast library. No external dependencies are required.
SIA data is monthly, so downloading an entire year for all states requires
324 files (27 UFs x 12 months) per type. Use uf and month
to limit downloads.
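The file count above follows directly from one file per type, state, and month. A short base R sketch makes the arithmetic concrete and shows how much narrowing uf and month reduces the work. `count_files()` is a hypothetical helper for this illustration, not part of the package.

```r
# The 27 federative units (26 states plus the Federal District).
ufs <- c("AC", "AL", "AP", "AM", "BA", "CE", "DF", "ES", "GO", "MA", "MT", "MS",
         "MG", "PA", "PB", "PR", "PE", "PI", "RJ", "RN", "RS", "RO", "RR", "SC",
         "SP", "SE", "TO")

# Hypothetical helper: one file per type x UF x month.
count_files <- function(uf = ufs, month = 1:12, n_types = 1L) {
  nrow(expand.grid(uf = uf, month = month)) * n_types
}

full_year   <- count_files()                        # all states, all months
one_state   <- count_files(uf = "AC")               # one state, all months
one_request <- count_files(uf = "AC", month = 1)    # one state, one month

print(c(full_year = full_year, one_state = one_state, one_request = one_request))
```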
The SIA has 13 file types. The default "PA" (outpatient production)
is the most commonly used. Use sia_info() to see all types.
When downloading multiple files (e.g., several months or states), install
furrr and future and set a parallel plan to speed up downloads:
future::plan(future::multisession, workers = 4). See
vignette("healthbR") for details.
A tibble with outpatient production microdata. Includes columns
year, month, and uf_source to identify the source
when multiple years/months/states are combined.
sia_info() for file type descriptions,
censo_populacao() for population denominators.
Other sia:
sia_cache_status(),
sia_clear_cache(),
sia_dictionary(),
sia_info(),
sia_variables(),
sia_years()
# all outpatient production in Acre, January 2022
ac_jan <- sia_data(year = 2022, month = 1, uf = "AC")

# filter by procedure code
consult <- sia_data(year = 2022, month = 1, uf = "AC", procedure = "0301")

# filter by diagnosis (CID-10)
resp <- sia_data(year = 2022, month = 1, uf = "AC", diagnosis = "J")

# only key variables
sia_data(year = 2022, month = 1, uf = "AC",
         vars = c("PA_PROC_ID", "PA_CIDPRI", "PA_SEXO", "PA_IDADE", "PA_VALAPR"))

# different file type (APAC Medicamentos)
med <- sia_data(year = 2022, month = 1, uf = "AC", type = "AM")
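The parse argument converts columns to appropriate types based on variable metadata. As a rough illustration of that idea, the sketch below applies a small type map to a toy data frame in which every column starts as character. The raw values, the `types` map, and `convert_columns()` are assumptions made for this example; the package keeps its own metadata and may convert differently.

```r
# Toy raw data: every column is character, values are invented.
raw <- data.frame(
  PA_IDADE  = c("034", "007"),
  PA_VALAPR = c("12.50", "3.20"),
  PA_SEXO   = c("F", "M"),
  stringsAsFactors = FALSE
)

# Hypothetical type map keyed by column name.
types <- c(PA_IDADE = "integer", PA_VALAPR = "double")

# Hypothetical helper: convert only the columns named in the map.
convert_columns <- function(data, types) {
  for (col in intersect(names(types), names(data))) {
    data[[col]] <- switch(
      types[[col]],
      integer = as.integer(data[[col]]),
      double  = as.numeric(data[[col]]),
      data[[col]]  # unknown type strings leave the column untouched
    )
  }
  data
}

parsed <- convert_columns(raw, types)
str(parsed)
```

Columns missing from the map, such as PA_SEXO here, stay as character.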
Returns a tibble with the complete data dictionary for the SIA, including variable descriptions and category labels.
sia_dictionary(variable = NULL)
variable |
Character. If provided, returns dictionary for a specific variable only. Default: NULL (returns all variables). |
A tibble with columns: variable, description, code, label.
Other sia:
sia_cache_status(),
sia_clear_cache(),
sia_data(),
sia_info(),
sia_variables(),
sia_years()
sia_dictionary()
sia_dictionary("PA_SEXO")
sia_dictionary("PA_RACACOR")
Displays information about the Outpatient Information System (SIA), including data sources, available years, file types, and usage guidance.
sia_info()
A list with module information (invisibly).
Other sia:
sia_cache_status(),
sia_clear_cache(),
sia_data(),
sia_dictionary(),
sia_variables(),
sia_years()
sia_info()
Returns a tibble with available variables in the SIA microdata (PA type), including descriptions and value types.
sia_variables(type = "PA", search = NULL)
type |
Character. File type to show variables for. Currently only
|
search |
Character. Optional search term to filter variables by name or description. Case-insensitive and accent-insensitive. |
A tibble with columns: variable, description, type, section.
Other sia:
sia_cache_status(),
sia_clear_cache(),
sia_data(),
sia_dictionary(),
sia_info(),
sia_years()
sia_variables()
sia_variables(search = "sexo")
sia_variables(search = "procedimento")
Returns an integer vector with years for which outpatient production microdata are available from DATASUS FTP.
sia_years(status = "final")
status |
Character. Filter by data status. One of:
|
An integer vector of available years.
Other sia:
sia_cache_status(),
sia_clear_cache(),
sia_data(),
sia_dictionary(),
sia_info(),
sia_variables()
sia_years()
sia_years(status = "all")
Shows information about cached SIH data files.
sih_cache_status(cache_dir = NULL)
cache_dir |
Character. Cache directory path. Default:
|
A tibble with cache file information (invisibly).
Other sih:
sih_clear_cache(),
sih_data(),
sih_dictionary(),
sih_info(),
sih_variables(),
sih_years()
sih_cache_status()
Deletes cached SIH data files.
sih_clear_cache(cache_dir = NULL)
cache_dir |
Character. Cache directory path. Default:
|
Invisible NULL.
Other sih:
sih_cache_status(),
sih_data(),
sih_dictionary(),
sih_info(),
sih_variables(),
sih_years()
sih_clear_cache()
Downloads and returns hospital admission microdata from DATASUS FTP. Each row represents one hospital admission record (AIH). Data is organized monthly – one .dbc file per state (UF) per month.
sih_data(
  year,
  month = NULL,
  vars = NULL,
  uf = NULL,
  diagnosis = NULL,
  parse = TRUE,
  col_types = NULL,
  cache = TRUE,
  cache_dir = NULL,
  lazy = FALSE,
  backend = c("arrow", "duckdb")
)
year |
Integer. Year(s) of the data. Required. |
month |
Integer. Month(s) of the data (1-12). If NULL (default),
downloads all 12 months. Example: |
vars |
Character vector. Variables to keep. If NULL (default),
returns all available variables. Use |
uf |
Character. Two-letter state abbreviation(s) to download.
If NULL (default), downloads all 27 states.
Example: |
diagnosis |
Character. CID-10 code pattern(s) to filter by principal
diagnosis ( |
parse |
Logical. If TRUE (default), converts columns to
appropriate types (integer, double, Date) based on the variable
metadata. Use |
col_types |
Named list. Override the default type for specific
columns. Names are column names, values are type strings:
|
cache |
Logical. If TRUE (default), caches downloaded data for faster future access. |
cache_dir |
Character. Directory for caching. Default:
|
lazy |
Logical. If TRUE, returns a lazy query object instead of a
tibble. Requires the arrow package. The lazy object supports
dplyr verbs (filter, select, mutate, etc.) which are pushed down
to the query engine before collecting into memory. Call
|
backend |
Character. Backend for lazy evaluation: |
Data is downloaded from DATASUS FTP as .dbc files (one per state per month). The .dbc format is decompressed internally using vendored C code from the blast library. No external dependencies are required.
SIH data is monthly, so downloading an entire year for all states requires
324 files (27 UFs x 12 months). Use uf and month to limit downloads.
When downloading multiple files (e.g., several months or states), install
furrr and future and set a parallel plan to speed up downloads:
future::plan(future::multisession, workers = 4). See
vignette("healthbR") for details.
A tibble with hospital admission microdata. Includes columns
year, month, and uf_source to identify the source when multiple
years/months/states are combined.
censo_populacao() for population denominators to calculate
hospitalization rates.
Other sih:
sih_cache_status(),
sih_clear_cache(),
sih_dictionary(),
sih_info(),
sih_variables(),
sih_years()
# all admissions in Acre, January 2022
ac_jan <- sih_data(year = 2022, month = 1, uf = "AC")

# heart attacks in Sao Paulo, first semester 2022
infarct_sp <- sih_data(year = 2022, month = 1:6, uf = "SP", diagnosis = "I21")

# only key variables, Rio de Janeiro, March 2022
sih_data(year = 2022, month = 3, uf = "RJ",
         vars = c("DIAG_PRINC", "DT_INTER", "SEXO", "IDADE", "MORTE", "VAL_TOT"))
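The diagnosis argument takes CID-10 code patterns, as in the "I21" example above. The sketch below shows one plausible reading of that filter on toy data, assuming a pattern is treated as a prefix of the principal diagnosis code. That prefix behaviour is an assumption, and `filter_by_prefix()` is a hypothetical helper rather than a package function.

```r
# Toy records: the codes have the usual CID-10 shape, the rows are invented.
toy <- data.frame(
  DIAG_PRINC = c("I210", "I219", "J189", "I10", "I214"),
  VAL_TOT    = c(1500, 2300, 800, 400, 1900),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep rows whose code starts with any given prefix.
filter_by_prefix <- function(data, column, prefixes) {
  matches <- lapply(prefixes, function(p) startsWith(data[[column]], p))
  keep <- Reduce(`|`, matches)
  data[keep, , drop = FALSE]
}

infarct <- filter_by_prefix(toy, "DIAG_PRINC", "I21")
print(infarct)
```

Passing several prefixes, for example `c("I21", "J")`, keeps rows that match any of them.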
Returns a tibble with the complete data dictionary for the SIH, including variable descriptions and category labels.
sih_dictionary(variable = NULL)
variable |
Character. If provided, returns dictionary for a specific variable only. Default: NULL (returns all variables). |
A tibble with columns: variable, description, code, label.
Other sih:
sih_cache_status(),
sih_clear_cache(),
sih_data(),
sih_info(),
sih_variables(),
sih_years()
sih_dictionary()
sih_dictionary("SEXO")
sih_dictionary("CAR_INT")
Displays information about the Hospital Information System (SIH), including data sources, available years, and usage guidance.
sih_info()
A list with module information (invisibly).
Other sih:
sih_cache_status(),
sih_clear_cache(),
sih_data(),
sih_dictionary(),
sih_variables(),
sih_years()
sih_info()
Returns a tibble with available variables in the SIH microdata, including descriptions and value types.
sih_variables(year = NULL, search = NULL)
year |
Integer. If provided, returns variables available for that specific year (reserved for future use). Default: NULL. |
search |
Character. Optional search term to filter variables by name or description. Case-insensitive. |
A tibble with columns: variable, description, type, section.
Other sih:
sih_cache_status(),
sih_clear_cache(),
sih_data(),
sih_dictionary(),
sih_info(),
sih_years()
sih_variables()
sih_variables(search = "diag")
sih_variables(search = "valor")
Returns an integer vector with years for which hospital admission microdata are available from DATASUS FTP.
sih_years(status = "final")
status |
Character. Filter by data status. One of:
|
An integer vector of available years.
Other sih:
sih_cache_status(),
sih_clear_cache(),
sih_data(),
sih_dictionary(),
sih_info(),
sih_variables()
sih_years()
sih_years(status = "all")
Shows information about cached SIM data files.
sim_cache_status(cache_dir = NULL)
cache_dir |
Character. Cache directory path. Default:
|
A tibble with cache file information (invisibly).
Other sim:
sim_clear_cache(),
sim_data(),
sim_dictionary(),
sim_info(),
sim_variables(),
sim_years()
sim_cache_status()
Deletes cached SIM data files.
sim_clear_cache(cache_dir = NULL)
cache_dir |
Character. Cache directory path. Default:
|
Invisible NULL.
Other sim:
sim_cache_status(),
sim_data(),
sim_dictionary(),
sim_info(),
sim_variables(),
sim_years()
sim_clear_cache()
Downloads and returns mortality microdata from DATASUS FTP. Each row represents one death record (Declaracao de Obito). Data is downloaded per state (UF) as compressed .dbc files, decompressed internally, and returned as a tibble.
sim_data(year, vars = NULL, uf = NULL, cause = NULL, decode_age = TRUE, parse = TRUE, col_types = NULL, cache = TRUE, cache_dir = NULL, lazy = FALSE, backend = c("arrow", "duckdb"))
year |
Integer. Year(s) of the data. Required. |
vars |
Character vector. Variables to keep. If NULL (default),
returns all available variables. Use |
uf |
Character. Two-letter state abbreviation(s) to download.
If NULL (default), downloads all 27 states.
Example: |
cause |
Character. CID-10 code pattern(s) to filter by cause of
death ( |
decode_age |
Logical. If TRUE (default), adds a numeric column
|
parse |
Logical. If TRUE (default), converts columns to
appropriate types (integer, double, Date) based on the variable
metadata. Use |
col_types |
Named list. Override the default type for specific
columns. Names are column names, values are type strings:
|
cache |
Logical. If TRUE (default), caches downloaded data for faster future access. |
cache_dir |
Character. Directory for caching. Default:
|
lazy |
Logical. If TRUE, returns a lazy query object instead of a
tibble. Requires the arrow package. The lazy object supports
dplyr verbs (filter, select, mutate, etc.) which are pushed down
to the query engine before collecting into memory. Call
|
backend |
Character. Backend for lazy evaluation: |
Data is downloaded from DATASUS FTP as .dbc files (one per state per year). The .dbc format is decompressed internally using vendored C code from the blast library. No external dependencies are required.
When uf is specified, only the requested state(s) are downloaded,
making the operation much faster than downloading the entire country.
When downloading multiple files (e.g., several years or states), install
furrr and future and set a parallel plan to speed up downloads:
future::plan(future::multisession, workers = 4). See
vignette("healthbR") for details.
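The parallel setup described above can be sketched as follows. This is a minimal illustration, not the package's internal code: `fetch_one()` is a hypothetical stand-in for a per-year download such as `sim_data(year = y, uf = "AC")`, and the explicit `furrr::future_map()` call only shows the mechanism that a parallel plan enables. The block falls back to a sequential loop when future or furrr is not installed.

```r
years <- 2020:2022

# Hypothetical stand-in for one per-year download; returns a tiny data frame.
fetch_one <- function(y) data.frame(year = y, n_rows = 0L)

if (requireNamespace("future", quietly = TRUE) &&
    requireNamespace("furrr", quietly = TRUE)) {
  # Set a parallel plan and remember the previous one.
  old_plan <- future::plan(future::multisession, workers = 2)
  parts <- furrr::future_map(years, fetch_one)
  # Restore the previous plan once the work is done.
  future::plan(old_plan)
} else {
  # Sequential fallback when the optional packages are missing.
  parts <- lapply(years, fetch_one)
}

combined <- do.call(rbind, parts)
```

Restoring the previous plan afterwards keeps the parallel workers from lingering for the rest of the session.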
A tibble with mortality microdata. Includes columns year
and uf_source to identify the source when multiple years/states
are combined.
censo_populacao() for population denominators to calculate
mortality rates.
Other sim:
sim_cache_status(),
sim_clear_cache(),
sim_dictionary(),
sim_info(),
sim_variables(),
sim_years()
# all deaths in Acre, 2022
ac_2022 <- sim_data(year = 2022, uf = "AC")
# deaths by infarct in Sao Paulo, 2020-2022
infarct_sp <- sim_data(year = 2020:2022, uf = "SP", cause = "I21")
# only key variables, Rio de Janeiro, 2022
sim_data(year = 2022, uf = "RJ", vars = c("DTOBITO", "SEXO", "IDADE", "RACACOR", "CODMUNRES", "CAUSABAS"))
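As a rough sketch of how death records and population denominators combine into a mortality rate, the base R example below uses toy data. The `uf_source` column is documented for `sim_data()`, but the rows are made up, and the `pop` table with its `uf` and `population` columns is a hypothetical stand-in, since the column layout returned by `censo_populacao()` is not shown in this section.

```r
# Toy death records standing in for sim_data() output (values are made up).
deaths <- data.frame(
  uf_source = c("AC", "AC", "AC", "SP", "SP"),
  CAUSABAS  = c("I219", "J189", "I210", "I219", "C349")
)

# Hypothetical population table; real denominators would come from
# censo_populacao(), whose column names are assumed here.
pop <- data.frame(uf = c("AC", "SP"), population = c(800000, 44000000))

# Count deaths per state, then join with the denominators.
counts <- as.data.frame(table(uf = deaths$uf_source),
                        responseName = "deaths",
                        stringsAsFactors = FALSE)
rates <- merge(counts, pop, by = "uf")

# Crude mortality rate per 100,000 inhabitants.
rates$rate_per_100k <- rates$deaths / rates$population * 1e5
```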
Returns a tibble with the complete data dictionary for the SIM, including variable descriptions and category labels.
sim_dictionary(variable = NULL)
variable |
Character. If provided, returns dictionary for a specific variable only. Default: NULL (returns all variables). |
A tibble with columns: variable, description, code, label.
Other sim:
sim_cache_status(),
sim_clear_cache(),
sim_data(),
sim_info(),
sim_variables(),
sim_years()
sim_dictionary()
sim_dictionary("SEXO")
sim_dictionary("RACACOR")
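One possible use of a dictionary with the columns `variable`, `description`, `code`, and `label` is to attach readable labels to coded values. The sketch below uses a toy dictionary and toy records; the rows are illustrative and are not copied from the real SIM dictionary.

```r
# Toy dictionary with the documented column layout (rows are illustrative).
dict <- data.frame(
  variable    = c("SEXO", "SEXO"),
  description = "Sex",
  code        = c("1", "2"),
  label       = c("Masculino", "Feminino")
)

# Toy records; "9" has no entry in the toy dictionary.
records <- data.frame(SEXO = c("2", "1", "2", "9"))

# Look up each code in the dictionary rows for this variable.
sexo_dict <- dict[dict$variable == "SEXO", ]
records$SEXO_label <- sexo_dict$label[match(records$SEXO, sexo_dict$code)]
```

Codes missing from the dictionary come back as `NA`, which makes unmapped values easy to spot.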
Displays information about the Mortality Information System (SIM), including data sources, available years, and usage guidance.
sim_info()
A list with module information (invisibly).
Other sim:
sim_cache_status(),
sim_clear_cache(),
sim_data(),
sim_dictionary(),
sim_variables(),
sim_years()
sim_info()
Returns a tibble with available variables in the SIM microdata, including descriptions and value types.
sim_variables(year = NULL, search = NULL)
year |
Integer. If provided, returns variables available for that specific year (reserved for future use). Default: NULL. |
search |
Character. Optional search term to filter variables by name or description. Case-insensitive. |
A tibble with columns: variable, description, type, section.
Other sim:
sim_cache_status(),
sim_clear_cache(),
sim_data(),
sim_dictionary(),
sim_info(),
sim_years()
sim_variables()
sim_variables(search = "causa")
sim_variables(search = "mae")
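The case-insensitive search over names and descriptions can be pictured with the sketch below. It is an assumption about the behaviour, not the package's implementation: `search_vars()` is a hypothetical helper, and the descriptions in the toy table are invented.

```r
# Toy variable table (descriptions are invented for illustration).
vars <- data.frame(
  variable    = c("CAUSABAS", "SEXO", "IDADEMAE"),
  description = c("Causa basica do obito", "Sexo do falecido", "Idade da mae")
)

# Hypothetical helper: literal, case-insensitive match on name or description.
search_vars <- function(tbl, term) {
  haystack <- tolower(paste(tbl$variable, tbl$description))
  tbl[grepl(tolower(term), haystack, fixed = TRUE), , drop = FALSE]
}

by_cause  <- search_vars(vars, "CAUSA")
by_mother <- search_vars(vars, "mae")
```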
Returns an integer vector with years for which mortality microdata are available from DATASUS FTP.
sim_years(status = "final")
status |
Character. Filter by data status. One of:
|
An integer vector of available years.
Other sim:
sim_cache_status(),
sim_clear_cache(),
sim_data(),
sim_dictionary(),
sim_info(),
sim_variables()
sim_years()
sim_years(status = "all")
Shows information about cached SINAN data files.
sinan_cache_status(cache_dir = NULL)
cache_dir |
Character. Cache directory path. Default:
|
A tibble with cache file information (invisibly).
Other sinan:
sinan_clear_cache(),
sinan_data(),
sinan_dictionary(),
sinan_diseases(),
sinan_info(),
sinan_variables(),
sinan_years()
sinan_cache_status()
Deletes cached SINAN data files.
sinan_clear_cache(cache_dir = NULL)
cache_dir |
Character. Cache directory path. Default:
|
Invisible NULL.
Other sinan:
sinan_cache_status(),
sinan_data(),
sinan_dictionary(),
sinan_diseases(),
sinan_info(),
sinan_variables(),
sinan_years()
sinan_clear_cache()
Downloads and returns notifiable disease microdata from DATASUS FTP. Each row represents one notification record (Ficha de Notificacao). Data is downloaded as national .dbc files (one file per disease per year), decompressed internally, and returned as a tibble.
sinan_data(year, disease = "DENG", vars = NULL, parse = TRUE, col_types = NULL, cache = TRUE, cache_dir = NULL, lazy = FALSE, backend = c("arrow", "duckdb"))
year |
Integer. Year(s) of the data. Required. |
disease |
Character. Disease code to download. Default: |
vars |
Character vector. Variables to keep. If NULL (default),
returns all available variables. Use |
parse |
Logical. If TRUE (default), converts columns to
appropriate types (integer, double, Date) based on the variable
metadata. Use |
col_types |
Named list. Override the default type for specific
columns. Names are column names, values are type strings:
|
cache |
Logical. If TRUE (default), caches downloaded data for faster future access. |
cache_dir |
Character. Directory for caching. Default:
|
lazy |
Logical. If TRUE, returns a lazy query object instead of a
tibble. Requires the arrow package. The lazy object supports
dplyr verbs (filter, select, mutate, etc.) which are pushed down
to the query engine before collecting into memory. Call
|
backend |
Character. Backend for lazy evaluation: |
SINAN files are national (not per-state). Each file contains all
notifications for a given disease in a given year across all of Brazil.
To filter by state, use the SG_UF_NOT (UF of notification) or
ID_MUNICIP (municipality code) columns after download.
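A minimal sketch of the post-download state filter, on toy data. It assumes that `SG_UF_NOT` holds two-digit IBGE state codes (35 is Sao Paulo) and that `ID_MUNICIP` holds six-digit IBGE municipality codes whose first two digits are the state code; check both assumptions against the actual files before relying on them.

```r
# Toy notifications standing in for sinan_data() output (values are made up).
notif <- data.frame(
  SG_UF_NOT  = c("35", "33", "35", "12"),
  ID_MUNICIP = c("355030", "330455", "350950", "120040")
)

# Filter by the state of notification (assumed IBGE code 35 = Sao Paulo).
sp_by_uf <- notif[notif$SG_UF_NOT == "35", ]

# Alternative: use the state prefix of the municipality code.
sp_by_mun <- notif[substr(notif$ID_MUNICIP, 1, 2) == "35", ]
```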
Data is downloaded from DATASUS FTP as .dbc files. The .dbc format is decompressed internally using vendored C code from the blast library. No external dependencies are required.
When downloading multiple files (e.g., several years or diseases), install
furrr and future and set a parallel plan to speed up downloads:
future::plan(future::multisession, workers = 4). See
vignette("healthbR") for details.
A tibble with notifiable disease microdata. Includes columns
year and disease to identify the source when multiple years are
combined.
Other sinan:
sinan_cache_status(),
sinan_clear_cache(),
sinan_dictionary(),
sinan_diseases(),
sinan_info(),
sinan_variables(),
sinan_years()
# dengue notifications, 2022
dengue_2022 <- sinan_data(year = 2022)
# tuberculosis, 2020-2022
tb <- sinan_data(year = 2020:2022, disease = "TUBE")
# only key variables
sinan_data(year = 2022, disease = "DENG", vars = c("DT_NOTIFIC", "CS_SEXO", "NU_IDADE_N", "CS_RACA", "ID_MUNICIP", "CLASSI_FIN"))
Returns a tibble with the complete data dictionary for the SINAN, including variable descriptions and category labels.
sinan_dictionary(variable = NULL)
variable |
Character. If provided, returns dictionary for a specific variable only. Default: NULL (returns all variables). |
A tibble with columns: variable, description, code, label.
Other sinan:
sinan_cache_status(),
sinan_clear_cache(),
sinan_data(),
sinan_diseases(),
sinan_info(),
sinan_variables(),
sinan_years()
sinan_dictionary()
sinan_dictionary("CS_SEXO")
sinan_dictionary("EVOLUCAO")
Returns a tibble with all notifiable diseases (agravos) available in SINAN, including codes, names, and descriptions.
sinan_diseases(search = NULL)
search |
Character. Optional search term to filter diseases by code, name, or description. Case-insensitive and accent-insensitive. |
A tibble with columns: code, name, description.
Other sinan:
sinan_cache_status(),
sinan_clear_cache(),
sinan_data(),
sinan_dictionary(),
sinan_info(),
sinan_variables(),
sinan_years()
sinan_diseases()
sinan_diseases(search = "dengue")
sinan_diseases(search = "sifilis")
Displays information about the Notifiable Diseases Information System (SINAN), including data sources, available years, diseases, and usage guidance.
sinan_info()
A list with module information (invisibly).
Other sinan:
sinan_cache_status(),
sinan_clear_cache(),
sinan_data(),
sinan_dictionary(),
sinan_diseases(),
sinan_variables(),
sinan_years()
sinan_info()
Returns a tibble with available variables in the SINAN microdata, including descriptions and value types.
sinan_variables(disease = "DENG", search = NULL)
disease |
Character. Disease code (e.g., "DENG"). Currently not used for filtering but reserved for future disease-specific variables. Default: "DENG". |
search |
Character. Optional search term to filter variables by name or description. Case-insensitive and accent-insensitive. |
A tibble with columns: variable, description, type, section.
Other sinan:
sinan_cache_status(),
sinan_clear_cache(),
sinan_data(),
sinan_dictionary(),
sinan_diseases(),
sinan_info(),
sinan_years()
sinan_variables()
sinan_variables(search = "sexo")
sinan_variables(search = "municipio")
Returns an integer vector with years for which notifiable diseases microdata are available from DATASUS FTP.
sinan_years(status = "final")
status |
Character. Filter by data status. One of:
|
An integer vector of available years.
Other sinan:
sinan_cache_status(),
sinan_clear_cache(),
sinan_data(),
sinan_dictionary(),
sinan_diseases(),
sinan_info(),
sinan_variables()
sinan_years()
sinan_years(status = "all")
Shows information about cached SINASC data files.
sinasc_cache_status(cache_dir = NULL)
cache_dir |
Character. Cache directory path. Default:
|
A tibble with cache file information (invisibly).
Other sinasc:
sinasc_clear_cache(),
sinasc_data(),
sinasc_dictionary(),
sinasc_info(),
sinasc_variables(),
sinasc_years()
sinasc_cache_status()
Deletes cached SINASC data files.
sinasc_clear_cache(cache_dir = NULL)
cache_dir |
Character. Cache directory path. Default:
|
Invisible NULL.
Other sinasc:
sinasc_cache_status(),
sinasc_data(),
sinasc_dictionary(),
sinasc_info(),
sinasc_variables(),
sinasc_years()
sinasc_clear_cache()
Downloads and returns live birth microdata from DATASUS FTP. Each row represents one live birth record (Declaracao de Nascido Vivo). Data is downloaded per state (UF) as compressed .dbc files, decompressed internally, and returned as a tibble.
sinasc_data(year, vars = NULL, uf = NULL, anomaly = NULL, parse = TRUE, col_types = NULL, cache = TRUE, cache_dir = NULL, lazy = FALSE, backend = c("arrow", "duckdb"))
year |
Integer. Year(s) of the data. Required. |
vars |
Character vector. Variables to keep. If NULL (default),
returns all available variables. Use |
uf |
Character. Two-letter state abbreviation(s) to download.
If NULL (default), downloads all 27 states.
Example: |
anomaly |
Character. CID-10 code pattern(s) to filter by congenital
anomaly ( |
parse |
Logical. If TRUE (default), converts columns to
appropriate types (integer, double, Date) based on the variable
metadata. Use |
col_types |
Named list. Override the default type for specific
columns. Names are column names, values are type strings:
|
cache |
Logical. If TRUE (default), caches downloaded data for faster future access. |
cache_dir |
Character. Directory for caching. Default:
|
lazy |
Logical. If TRUE, returns a lazy query object instead of a
tibble. Requires the arrow package. The lazy object supports
dplyr verbs (filter, select, mutate, etc.) which are pushed down
to the query engine before collecting into memory. Call
|
backend |
Character. Backend for lazy evaluation: |
Data is downloaded from DATASUS FTP as .dbc files (one per state per year). The .dbc format is decompressed internally using vendored C code from the blast library. No external dependencies are required.
When uf is specified, only the requested state(s) are downloaded,
making the operation much faster than downloading the entire country.
When downloading multiple files (e.g., several years or states), install
furrr and future and set a parallel plan to speed up downloads:
future::plan(future::multisession, workers = 4). See
vignette("healthbR") for details.
A tibble with live birth microdata. Includes columns year
and uf_source to identify the source when multiple years/states
are combined.
censo_populacao() for population denominators to calculate
birth rates.
Other sinasc:
sinasc_cache_status(),
sinasc_clear_cache(),
sinasc_dictionary(),
sinasc_info(),
sinasc_variables(),
sinasc_years()
# all births in Acre, 2022
ac_2022 <- sinasc_data(year = 2022, uf = "AC")
# births with anomalies in Sao Paulo, 2020-2022
anomalies_sp <- sinasc_data(year = 2020:2022, uf = "SP", anomaly = "Q")
# only key variables, Rio de Janeiro, 2022
sinasc_data(year = 2022, uf = "RJ", vars = c("DTNASC", "SEXO", "PESO", "IDADEMAE", "PARTO", "CONSULTAS"))
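The lazy workflow described for the `lazy` argument (build a query with dplyr verbs, then collect) might look roughly like the sketch below. The exact class of the object returned with `lazy = TRUE` is not documented in this section, so an in-memory Arrow table built from toy data is used as a stand-in; the block falls back to plain base R subsetting when arrow or dplyr is not installed.

```r
# Toy birth records standing in for sinasc_data() output (values are made up).
births <- data.frame(
  SEXO = c("1", "2", "1"),
  PESO = c(3200L, 2400L, 2100L)
)

if (requireNamespace("arrow", quietly = TRUE) &&
    requireNamespace("dplyr", quietly = TRUE)) {
  # Stand-in for the lazy object: an Arrow table that accepts dplyr verbs.
  lazy_tbl <- arrow::arrow_table(births)
  low_weight <- lazy_tbl |>
    dplyr::filter(PESO < 2500L) |>   # evaluated by the Arrow engine
    dplyr::select(SEXO, PESO) |>
    dplyr::collect()                 # materialise the result in memory
} else {
  # Eager fallback with the same result shape.
  low_weight <- births[births$PESO < 2500L, c("SEXO", "PESO")]
}
```

Filtering before collecting means only the matching rows and selected columns are brought into memory.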
Returns a tibble with the complete data dictionary for the SINASC, including variable descriptions and category labels.
sinasc_dictionary(variable = NULL)
variable |
Character. If provided, returns dictionary for a specific variable only. Default: NULL (returns all variables). |
A tibble with columns: variable, description, code, label.
Other sinasc:
sinasc_cache_status(),
sinasc_clear_cache(),
sinasc_data(),
sinasc_info(),
sinasc_variables(),
sinasc_years()
sinasc_dictionary()
sinasc_dictionary("SEXO")
sinasc_dictionary("PARTO")
Displays information about the Live Birth Information System (SINASC), including data sources, available years, and usage guidance.
sinasc_info()
A list with module information (invisibly).
Other sinasc:
sinasc_cache_status(),
sinasc_clear_cache(),
sinasc_data(),
sinasc_dictionary(),
sinasc_variables(),
sinasc_years()
sinasc_info()
Returns a tibble with available variables in the SINASC microdata, including descriptions and value types.
sinasc_variables(year = NULL, search = NULL)
year |
Integer. If provided, returns variables available for that specific year (reserved for future use). Default: NULL. |
search |
Character. Optional search term to filter variables by name or description. Case-insensitive. |
A tibble with columns: variable, description, type, section.
Other sinasc:
sinasc_cache_status(),
sinasc_clear_cache(),
sinasc_data(),
sinasc_dictionary(),
sinasc_info(),
sinasc_years()
sinasc_variables()
sinasc_variables(search = "mae")
sinasc_variables(search = "parto")
Returns an integer vector with years for which live birth microdata are available from DATASUS FTP.
sinasc_years(status = "final")
status |
Character. Filter by data status. One of:
|
An integer vector of available years.
Other sinasc:
sinasc_cache_status(),
sinasc_clear_cache(),
sinasc_data(),
sinasc_dictionary(),
sinasc_info(),
sinasc_variables()
sinasc_years()
sinasc_years(status = "all")
Shows information about cached SI-PNI data files.
sipni_cache_status(cache_dir = NULL)
cache_dir |
Character. Cache directory path. Default:
|
A tibble with cache file information (invisibly).
Other sipni:
sipni_clear_cache(),
sipni_data(),
sipni_dictionary(),
sipni_info(),
sipni_variables(),
sipni_years()
sipni_cache_status()
Deletes cached SI-PNI data files.
sipni_clear_cache(cache_dir = NULL)
cache_dir |
Character. Cache directory path. Default:
|
Invisible NULL.
Other sipni:
sipni_cache_status(),
sipni_data(),
sipni_dictionary(),
sipni_info(),
sipni_variables(),
sipni_years()
sipni_clear_cache()
Downloads and returns vaccination data from SI-PNI. For years 1994–2019, data is downloaded from DATASUS FTP (aggregated doses/coverage). For years 2020+, data is downloaded from OpenDataSUS as monthly CSV bulk files (individual-level microdata with one row per vaccination dose).
sipni_data(year, type = "DPNI", uf = NULL, month = NULL, vars = NULL, parse = TRUE, col_types = NULL, cache = TRUE, cache_dir = NULL, lazy = FALSE, backend = c("arrow", "duckdb"))
year |
Integer. Year(s) of the data. Required. |
type |
Character. File type for FTP data (1994–2019). Default:
|
uf |
Character. Two-letter state abbreviation(s) to download.
If NULL (default), downloads all 27 states.
Example: |
month |
Integer. Month(s) to download (1–12). For years >= 2020 (CSV), selects which monthly CSV files to download. For years <= 2019 (FTP), this parameter is ignored (FTP files are annual). If NULL (default), downloads all 12 months. |
vars |
Character vector. Variables to keep. If NULL (default),
returns all available variables. Use |
parse |
Logical. If TRUE (default), converts columns to
appropriate types (integer, double, Date) based on the variable
metadata. Use |
col_types |
Named list. Override the default type for specific
columns. Names are column names, values are type strings:
|
cache |
Logical. If TRUE (default), caches downloaded data for faster future access. |
cache_dir |
Character. Directory for caching. Default:
|
lazy |
Logical. If TRUE, returns a lazy query object instead of a
tibble. Requires the arrow package. The lazy object supports
dplyr verbs (filter, select, mutate, etc.) which are pushed down
to the query engine before collecting into memory. Call
|
backend |
Character. Backend for lazy evaluation: |
FTP data (1994–2019): Downloaded as plain .DBF files. SI-PNI FTP data is aggregated (dose counts and coverage rates per municipality, vaccine, and age group). Two file types: DPNI (doses) and CPNI (coverage).
CSV data (2020+):
Downloaded from OpenDataSUS as monthly CSV bulk files (national,
semicolon-delimited, latin1 encoding). Each monthly ZIP is ~1.4 GB.
This is individual-level microdata (one row per vaccination dose,
~47 fields per record). The type parameter is ignored for CSV
years. Data is filtered by UF during chunked reading to avoid loading
the full national file into memory.
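The idea of filtering by UF during chunked reading can be sketched in base R as below. This is an illustration of the technique under stated assumptions, not the package's implementation: `read_filtered()` is a hypothetical helper, the column names (`sigla_uf`, `descricao_vacina`) and rows in the toy file are invented, and a tiny chunk size is used only to exercise the loop.

```r
# Toy semicolon-delimited file standing in for a monthly SI-PNI CSV.
# Column names and rows are illustrative, not the real schema.
path <- tempfile(fileext = ".csv")
writeLines(c(
  "sigla_uf;descricao_vacina",
  "AC;BCG",
  "SP;BCG",
  "AC;Hepatite B",
  "RJ;Febre Amarela",
  "AC;BCG"
), path)

# Hypothetical helper: read a few lines at a time and keep only the
# rows for the requested state(s), so the full file never sits in memory.
read_filtered <- function(path, uf, chunk_size = 2L) {
  con <- file(path, open = "r", encoding = "latin1")
  on.exit(close(con))
  col_names <- strsplit(readLines(con, n = 1L), ";", fixed = TRUE)[[1]]
  kept <- list()
  repeat {
    lines <- readLines(con, n = chunk_size)
    if (length(lines) == 0L) break
    chunk <- read.table(text = lines, sep = ";", col.names = col_names,
                        colClasses = "character", quote = "",
                        comment.char = "", stringsAsFactors = FALSE)
    kept[[length(kept) + 1L]] <- chunk[chunk$sigla_uf %in% uf, , drop = FALSE]
  }
  do.call(rbind, kept)
}

ac <- read_filtered(path, uf = "AC")
```

Peak memory is bounded by the chunk size plus the rows kept so far, rather than by the size of the national file.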
When downloading multiple files (e.g., several years or states), install
furrr and future and set a parallel plan to speed up downloads:
future::plan(future::multisession, workers = 4). See
vignette("healthbR") for details.
A tibble with vaccination data. Includes columns
year and uf_source to identify the source
when multiple years/states are combined.
Output differs by year range:
1994–2019 (FTP): Aggregated data with DPNI (12 vars) or CPNI (7 vars) columns, all character.
2020+ (CSV): Individual-level microdata with ~47 columns
(snake_case Portuguese), all character. Use
sipni_variables(type = "API") to see the full list.
sipni_info() for type descriptions,
censo_populacao() for population denominators.
Other sipni:
sipni_cache_status(),
sipni_clear_cache(),
sipni_dictionary(),
sipni_info(),
sipni_variables(),
sipni_years()
# FTP: doses applied in Acre, 2019
ac_doses <- sipni_data(year = 2019, uf = "AC")
# FTP: vaccination coverage in Acre, 2019
ac_cob <- sipni_data(year = 2019, type = "CPNI", uf = "AC")
# API: microdata for Acre, January 2024
ac_api <- sipni_data(year = 2024, uf = "AC", month = 1)
# API: select specific variables
sipni_data(year = 2024, uf = "AC", month = 1, vars = c("descricao_vacina", "tipo_sexo_paciente", "data_vacina"))
Returns a tibble with the data dictionary for the SI-PNI FTP data (1994–2019), including variable descriptions and category labels.
sipni_dictionary(variable = NULL)
variable |
Character. If provided, returns dictionary for a specific variable only. Default: NULL (returns all variables). |
The dictionary covers FTP data variables (DPNI/CPNI, 1994–2019).
API microdata (2020+) has description fields embedded in the data
itself (e.g., descricao_vacina, nome_raca_cor_paciente),
so a separate dictionary is not needed.
A tibble with columns: variable, description, code, label.
Other sipni:
sipni_cache_status(),
sipni_clear_cache(),
sipni_data(),
sipni_info(),
sipni_variables(),
sipni_years()
sipni_dictionary()
sipni_dictionary("IMUNO")
sipni_dictionary("DOSE")
Displays information about the National Immunization Program Information System (SI-PNI), including data sources, available years, file types, and usage guidance.
sipni_info()
A list with module information (invisibly).
Other sipni:
sipni_cache_status(),
sipni_clear_cache(),
sipni_data(),
sipni_dictionary(),
sipni_variables(),
sipni_years()
sipni_info()
Returns a tibble with available variables in the SI-PNI data, including descriptions and value types.
sipni_variables(type = "DPNI", search = NULL)
type |
Character. File type to show variables for.
|
search |
Character. Optional search term to filter variables by name or description. Case-insensitive and accent-insensitive. |
A tibble with columns: variable, description, type, section.
Other sipni:
sipni_cache_status(),
sipni_clear_cache(),
sipni_data(),
sipni_dictionary(),
sipni_info(),
sipni_years()
sipni_variables()
sipni_variables(type = "CPNI")
sipni_variables(type = "API")
sipni_variables(search = "dose")
Returns an integer vector with years for which vaccination data are available.
sipni_years()
SI-PNI data is available from two sources:
FTP (1994–2019): Aggregated data (doses applied and coverage) from DATASUS FTP as plain .DBF files.
CSV (2020–2025): Individual-level microdata from OpenDataSUS as monthly CSV bulk downloads (one row per vaccination dose).
An integer vector of available years (1994–2025).
Other sipni:
sipni_cache_status(),
sipni_clear_cache(),
sipni_data(),
sipni_dictionary(),
sipni_info(),
sipni_variables()
sipni_years()
Shows information about cached SISAB data files.
sisab_cache_status(cache_dir = NULL)
cache_dir |
Character. Cache directory path. Default:
|
A tibble with cache file information (invisibly).
Other sisab:
sisab_clear_cache(),
sisab_data(),
sisab_info(),
sisab_variables(),
sisab_years()
sisab_cache_status()
Deletes cached SISAB data files.
sisab_clear_cache(cache_dir = NULL)
cache_dir |
Character. Cache directory path. Default:
|
Invisible NULL.
Other sisab:
sisab_cache_status(),
sisab_data(),
sisab_info(),
sisab_variables(),
sisab_years()
sisab_clear_cache()
Downloads and returns primary care coverage data from the SISAB relatorioaps API. Data is aggregated (coverage indicators per geographic unit and period), not individual-level microdata.
sisab_data(year, type = "aps", level = "uf", month = NULL, uf = NULL, vars = NULL, cache = TRUE, cache_dir = NULL)
year |
Integer. Year(s) of the data. Required. |
type |
Character. Report type to download. Default: |
level |
Character. Geographic aggregation level. Default:
|
month |
Integer. Month(s) to download (1–12). If NULL (default), downloads all 12 months. |
uf |
Character. Two-letter state abbreviation to filter by when
|
vars |
Character vector. Variables to keep. If NULL (default),
returns all available variables. Use |
cache |
Logical. If TRUE (default), caches downloaded data for faster future access. |
cache_dir |
Character. Directory for caching. Default:
|
Data is fetched from the relatorioaps REST API
(https://relatorioaps.saude.gov.br), the public reporting portal
for primary care in Brazil.
Four report types are available:
"aps" (default): APS coverage – number of primary care
teams (eSF, eAP, eSFR, eCR, eAPP) and estimated coverage percentage.
Available from 2019.
"sb": Oral health coverage – dental care teams and
coverage. Available from 2024.
"acs": Community health agents – number of active ACS
and population coverage. Available from 2007.
"pns": PNS coverage – coverage estimates from the
National Health Survey. Available 2020–2023.
For municipality-level data, it is recommended to filter by UF using the
uf parameter to avoid large downloads.
When downloading multiple months, install furrr and future and
set a parallel plan to speed up downloads:
future::plan(future::multisession, workers = 4). See
vignette("healthbR") for details.
A tibble with coverage data. Includes columns year and
type to identify the source when multiple years/types are
combined. Column names are preserved from the API (camelCase).
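Because the API's camelCase names are kept as-is, some users may prefer to rename columns to snake_case after download. The helper below is one simple way to do that; `to_snake()` is a hypothetical function, and the column names in `api_names` are invented examples rather than actual SISAB fields.

```r
# Hypothetical helper: insert "_" before each capital letter that follows
# a lowercase letter or digit, then lowercase everything.
to_snake <- function(x) {
  tolower(gsub("([a-z0-9])([A-Z])", "\\1_\\2", x))
}

# Invented camelCase names used only for illustration.
api_names <- c("nuAno", "sgUf", "qtCobertura")
snake_names <- to_snake(api_names)
```

On a downloaded tibble, the same helper could be applied with `names(df) <- to_snake(names(df))`.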
sisab_info() for report type descriptions,
censo_populacao() for population denominators.
Other sisab:
sisab_cache_status(),
sisab_clear_cache(),
sisab_info(),
sisab_variables(),
sisab_years()
# APS coverage by state, January 2024
sisab_data(year = 2024, month = 1)
# National total, full year 2023
sisab_data(year = 2023, level = "brazil")
# Oral health coverage by UF
sisab_data(year = 2024, type = "sb", month = 6)
# Municipality level for Sao Paulo
sisab_data(year = 2024, level = "municipality", uf = "SP", month = 1)
Displays information about the Primary Care Health Information System (SISAB), including data sources, available report types, and usage guidance.
sisab_info()
A list with module information (invisibly).
Other sisab:
sisab_cache_status(),
sisab_clear_cache(),
sisab_data(),
sisab_variables(),
sisab_years()
sisab_info()
Returns a tibble with available variables in the SISAB coverage data, including descriptions and value types.
sisab_variables(type = "aps", search = NULL)
type |
Character. Report type to show variables for.
|
search |
Character. Optional search term to filter variables by name or description. Case-insensitive and accent-insensitive. |
A tibble with columns: variable, description, type, section.
Other sisab:
sisab_cache_status(),
sisab_clear_cache(),
sisab_data(),
sisab_info(),
sisab_years()
sisab_variables()
sisab_variables(type = "sb")
sisab_variables(search = "cobertura")
Returns an integer vector with years for which SISAB coverage data are potentially available from the relatorioaps API. Actual availability depends on the report type.
sisab_years()
Availability by report type:
aps: APS coverage (2019–present)
sb: Oral health coverage (2024–present)
acs: Community health agents (2007–present)
pns: PNS coverage (2020–2023)
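Since availability depends on the report type, a small lookup can help validate a request before calling the API. The sketch below encodes the year ranges listed above; `type_available()` and the `sisab_availability` table are hypothetical helpers, not part of the package, and "present" is represented as an open-ended range.

```r
# Year ranges taken from the list above; NA means "no end year" (present).
sisab_availability <- data.frame(
  type       = c("aps", "sb", "acs", "pns"),
  first_year = c(2019L, 2024L, 2007L, 2020L),
  last_year  = c(NA, NA, NA, 2023L)
)

# Hypothetical helper: is `year` within the documented range for `type`?
type_available <- function(type, year) {
  row <- sisab_availability[sisab_availability$type == type, ]
  if (nrow(row) == 0L) stop("unknown report type: ", type)
  year >= row$first_year & (is.na(row$last_year) | year <= row$last_year)
}

sb_2023  <- type_available("sb", 2023)
pns_2024 <- type_available("pns", 2024)
acs_2010 <- type_available("acs", 2010)
```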
An integer vector of available years.
Other sisab:
sisab_cache_status(),
sisab_clear_cache(),
sisab_data(),
sisab_info(),
sisab_variables()
sisab_years()
Shows cache status including downloaded files and their sizes.
vigitel_cache_status(cache_dir = NULL)
cache_dir |
Character. Optional custom cache directory. If NULL (default), uses the standard user cache directory. |
A tibble with cache information
# check cache status
vigitel_cache_status()
Removes all cached VIGITEL data files.
vigitel_clear_cache(keep_parquet = FALSE, cache_dir = NULL)
keep_parquet |
Logical. If TRUE, keep parquet cache and only remove source files (ZIP, DTA, CSV). Default is FALSE (remove all). |
cache_dir |
Character. Optional custom cache directory. If NULL (default), uses the standard user cache directory. |
NULL (invisibly)
# remove all cached files from default cache
vigitel_clear_cache()
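To make the effect of `keep_parquet` concrete, here is a base-R sketch of selective cache clearing in a throwaway directory. `clear_cache_sketch()` and the file names are hypothetical, and the sketch assumes a flat directory of files, whereas the real cache may use partitioned parquet directories, so treat it as an illustration of the documented behaviour, not the package's code.

```r
# Hypothetical helper mirroring the documented keep_parquet behaviour:
# remove every file, or every file except those ending in .parquet.
clear_cache_sketch <- function(cache_dir, keep_parquet = FALSE) {
  files <- list.files(cache_dir, full.names = TRUE)
  if (keep_parquet) {
    files <- files[!grepl("\\.parquet$", files, ignore.case = TRUE)]
  }
  unlink(files)
  invisible(NULL)
}

# Build a toy cache with made-up file names inside the session temp dir.
demo_dir <- file.path(tempdir(), "vigitel_cache_demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, c("a.zip", "a.dta", "a.csv", "a.parquet")))

# Source files (ZIP, DTA, CSV) are removed; the parquet file is kept.
clear_cache_sketch(demo_dir, keep_parquet = TRUE)
remaining <- list.files(demo_dir)
print(remaining)
```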
Downloads and returns VIGITEL survey microdata from the Ministry of Health.
Data is cached locally to avoid repeated downloads. When the arrow package
is installed, data is cached in partitioned parquet format for faster
subsequent reads.
vigitel_data(
  year = NULL,
  format = c("dta", "csv"),
  vars = NULL,
  cache_dir = NULL,
  force = FALSE,
  lazy = FALSE,
  backend = c("arrow", "duckdb")
)
year |
Integer or vector of integers. Years to return (2006-2024). Use NULL to return all years. Default is NULL. |
format |
Character. File format to download: "dta" (Stata, default) or "csv". Stata format preserves variable labels. |
vars |
Character vector. Variables to select. Use NULL for all variables. Default is NULL. |
cache_dir |
Character. Directory for caching downloaded files.
Default uses |
force |
Logical. If TRUE, re-download even if file exists in cache. Default is FALSE. |
lazy |
Logical. If TRUE, returns a lazy query object instead of a
tibble. Requires the arrow package. The lazy object supports
dplyr verbs (filter, select, mutate, etc.) which are pushed down
to the query engine before collecting into memory. Call
|
backend |
Character. Backend for lazy evaluation: |
The VIGITEL survey (Vigilância de Fatores de Risco e Proteção para Doenças Crônicas por Inquérito Telefônico) is conducted annually by the Brazilian Ministry of Health in all state capitals and the Federal District.
Data includes information on:
Demographics (age, sex, education, race)
Health behaviors (smoking, alcohol, diet, physical activity)
Health conditions (hypertension, diabetes, obesity)
Healthcare utilization
The survey uses post-stratification weights (variable pesorake) to produce
population estimates. Always use these weights for statistical inference.
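As a rough illustration of why the weights matter, the sketch below compares an unweighted and a weighted proportion on a toy data frame. The values and the `hypertension` column name are made up; only `pesorake` is taken from the text above. It yields a point estimate only, and proper standard errors would need a survey-design approach.

```r
# Toy data: made-up values, not real VIGITEL records.
toy <- data.frame(
  hypertension = c(1, 0, 1, 0),  # hypothetical 0/1 indicator column
  pesorake     = c(2, 1, 1, 4)   # made-up post-stratification weights
)

# Unweighted share of respondents reporting the condition.
unweighted <- mean(toy$hypertension)

# Weighted share, using pesorake as the weight for each respondent.
weighted <- weighted.mean(toy$hypertension, w = toy$pesorake)

print(c(unweighted = unweighted, weighted = weighted))
```

With these toy numbers the two estimates differ because the respondents reporting the condition carry less total weight than the others.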
When the arrow package is installed, data is cached in partitioned parquet
format. This allows the function to read only the requested years without
loading the entire dataset into memory. If you frequently work with VIGITEL
data, installing arrow is highly recommended:
install.packages("arrow")
A tibble with VIGITEL microdata.
Data is downloaded from the Ministry of Health website:
https://svs.aids.gov.br/daent/cgdnt/vigitel/
# download all years (uses tempdir to avoid leaving files)
df <- vigitel_data(cache_dir = tempdir())

# download specific year
df_2024 <- vigitel_data(year = 2024, cache_dir = tempdir())

# download multiple years
df_recent <- vigitel_data(year = 2020:2024, cache_dir = tempdir())

# select specific variables
df_subset <- vigitel_data(
  year = 2024,
  vars = c("ano", "cidade", "sexo", "idade", "pesorake"),
  cache_dir = tempdir()
)
Downloads and returns the VIGITEL data dictionary containing variable descriptions, codes, and categories.
vigitel_dictionary(cache_dir = NULL, force = FALSE)
cache_dir |
Character. Directory for caching downloaded files.
Default uses |
force |
Logical. If TRUE, re-download even if file exists in cache. Default is FALSE. |
A tibble with variable dictionary.
dict <- vigitel_dictionary(cache_dir = tempdir())
head(dict)
Returns metadata about the VIGITEL survey.
vigitel_info()
A list with survey information.
vigitel_info()
Returns a tibble with information about available variables in the VIGITEL dataset.
vigitel_variables(cache_dir = NULL, force = FALSE)
cache_dir |
Character. Directory for caching downloaded files.
Default uses |
force |
Logical. If TRUE, re-download even if file exists in cache. Default is FALSE. |
A tibble with variable information from the dictionary.
vars <- vigitel_variables(cache_dir = tempdir())
head(vars)
Returns a vector of years for which VIGITEL microdata is available for download from the Ministry of Health website.
vigitel_years()
An integer vector of available years (2006-2024).
vigitel_years()