SDK Reporting

SDK reporting is powerful, but using it to its full extent requires specialized expertise. It is built on R, a language for statistical computing and graphics. With it you can create simple charts, tables, and reports, or run complex data analysis and machine learning algorithms.

Because SDK reporting has a learning curve, we provide two ways to get the most out of your data without learning a statistical language:

Custom Reports

If you need a custom report, please contact us. Based on the level of effort, the report creation may fit within your current contract. We may also create reports at no additional cost if the report is of a general nature that other SDK clients can use.

General Reports

Dataset Outliers Report:

Purpose: Identify field values that are potentially incorrect so that project managers can review them.

Summary: For each field identified as numeric and linear (as opposed to numeric values that represent select options), upper and lower limits are established, and values beyond those limits are treated as outliers. Reports run daily for each project and examine all active datasets (those that have received new records or changes since the last report) with at least 1,000 records. By default, the limits are determined using an interquartile range multiplier of 3, similar to the way extreme outliers are defined in boxplots.

Report Format: The generated report is emailed to project managers. For each dataset in a project, it lists the unique values identified as outliers along with their field names.

Figure: Box and whisker plot
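To make the outlier rule concrete, here is a minimal sketch in base R of how limits based on the interquartile range can be computed. The function name `find_outliers`, its arguments, and the sample data are illustrative assumptions, not part of the SDK. The quantile method the SDK actually uses may also differ from R's default.

```r
# Sketch only: flag extreme outliers using the interquartile range (IQR).
# `find_outliers` is a hypothetical helper, not an SDK function.
find_outliers <- function(values, multiplier = 3) {
  values <- values[!is.na(values)]  # ignore blank entries
  q <- quantile(values, probs = c(0.25, 0.75), names = FALSE)
  iqr <- q[2] - q[1]
  lower <- q[1] - multiplier * iqr  # lower limit
  upper <- q[2] + multiplier * iqr  # upper limit
  list(
    lower = lower,
    upper = upper,
    # unique values outside the limits, as listed in the report
    outliers = sort(unique(values[values < lower | values > upper]))
  )
}

# Made-up ages containing two implausible entries
ages <- c(21, 25, 30, 34, 38, 41, 45, 52, 60, 430, 430, -150)
result <- find_outliers(ages)
print(result$outliers)
```

The sample vector is kept short for readability; the actual report only examines datasets with at least 1,000 records.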

Data Overview:

Purpose: Offers a high-level summary of the values in each field for each dataset in a project.

Summary: The summary depends on the data type of the field. For numeric fields, a histogram of the distribution is shown along with the median, min, max, non-blank count, and empty count. For categorical fields (select options used in the form), a bar chart is displayed along with the most common value and counts of non-blank and empty fields. For string fields, a sample value is shown along with a unique value count and counts of non-blank and empty fields.
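The per-type statistics described above can be approximated in base R as follows. This is a sketch under assumptions: `summarise_field`, its `type` argument, and the sample vectors are hypothetical names chosen for illustration, and treating both NA and empty strings as "empty" is a guess at how blanks are counted. The charts are omitted; `hist()` and `barplot()` would be the base R starting points for those.

```r
# Sketch only: summarise one field according to its data type.
# `summarise_field` is a hypothetical helper, not an SDK function.
summarise_field <- function(values, type = c("numeric", "categorical", "string")) {
  type <- match.arg(type)
  # Assumption: NA and empty/whitespace-only strings both count as empty
  blank <- is.na(values) | trimws(as.character(values)) == ""
  present <- values[!blank]
  counts <- list(non_blank = sum(!blank), empty = sum(blank))

  if (type == "numeric") {
    present <- as.numeric(present)
    c(list(median = median(present), min = min(present), max = max(present)), counts)
  } else if (type == "categorical") {
    freq <- table(present)
    c(list(most_common = names(freq)[which.max(freq)]), counts)
  } else {
    c(list(sample = as.character(present[1]),
           unique_values = length(unique(present))), counts)
  }
}

# Made-up example fields
heights <- c(150, 162, NA, 171, 180)
colours <- c("red", "blue", "red", "", "green")
print(summarise_field(heights, "numeric"))
print(summarise_field(colours, "categorical"))
```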


Custom R & SQL Report

If you are familiar with R or SQL, you can write your own reports and automate the running of those reports. Documentation on how to do that can be found below, but you may have additional questions that we are happy to answer. As always, we provide friendly support.


Executing SQL statements:

Purpose: Allow users to run SQL queries that have access to tables within their projects.

Summary: Users can write and execute read-only SQL statements that reference any tables within their projects. The results are returned in HTML format, either on demand or by scheduled email.


R Markdown Reports:

Purpose: Allow users to upload custom R Markdown reports.

Summary: Users can upload custom R reports that have access to any datasets in their projects. Once a report is uploaded, it can run on demand or on a schedule and be sent to a predefined set of project users.

This is a basic example of a report that selects all records from a specific table.

Download R Markdown template


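  # send the query using the con database connection
  queryResults <- dbSendQuery(con, "select * from table_name_here")
  # retrieve the rows; the fetched data frame is the report's result
  fetch(queryResults)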

If an R report is uploaded (with a .R file extension), the format shown above is expected, and the results will be wrapped in an R Markdown report for viewing.

Example output:

Figure: Screenshot of the example report output

If an R Markdown report is uploaded (with a .Rmd file extension), an HTML file of the report output is returned. With this option, you can format the report in any way you see fit; however, you must include the con variable in order to connect to the database.


Sample Rmd report script:

The con variable must be present to connect to the SDK database.


  ---
  title: "R Markdown Example"
  output: html_document
  ---

  ```{r message=FALSE, warning=FALSE, echo=FALSE, results='hide'}
  library(RMySQL)
  library(plyr)
  library(ggplot2)

  fields <- 'winner'
  table.name <- 'ping_pong_matches'
  sql.statement <- paste('select', fields, 'from', table.name)
  query.results <- dbSendQuery(con, sql.statement)

  # retrieve and clean data
  dataset <- fetch(query.results, n = -1)
  winners <- dataset$winner
  winners <- toupper(winners[winners != "\"quote"])

  counts <- count(winners)
  names(counts) <- c('name', 'wins')
  ```

  ## Sample Report
  Example of a custom R markdown report

  ```{r}
  print(counts)

  ggplot(data.frame(winners), aes(x=winners)) + geom_bar() + 
    theme(axis.text.x = element_text(angle = 90, hjust = 1))
  ```

Sample Rmd report output:

Figure: Screenshot of the sample Rmd report output
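The cleaning and counting steps from the sample script can be tried locally without a database connection. The sketch below assumes a made-up vector in place of `dataset$winner`, and it uses base R's `table()` as a stand-in for `plyr::count()`, so it is an approximation of the sample, not the same code.

```r
# Stand-in for dataset$winner; in a real report this comes from the query.
winners <- c("alice", "Bob", "ALICE", "\"quote", "bob", "alice")

# Same cleaning as the sample script: drop the literal value "quote
# (with its leading double quote) and normalise names to upper case.
winners <- toupper(winners[winners != "\"quote"])

# Base R substitute for plyr::count(): one row per name with its win count.
counts <- as.data.frame(table(winners), stringsAsFactors = FALSE)
names(counts) <- c("name", "wins")
print(counts)
```

Normalising case before counting matters here: without `toupper()`, "alice" and "ALICE" would be tallied as different players.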

Report Scheduling:

Purpose: Allow users to control the recipients and timing of scheduled reports.

Summary: After a report is saved, a project admin can configure which project members will receive the report and also which days of the week it will be delivered.
