sevenbridges is an R/Bioconductor package that provides an interface for the Seven Bridges public API. The supported platforms include the Seven Bridges Platform, Cancer Genomics Cloud (CGC), and Cavatica.
Learn more from our documentation on the Seven Bridges Platform and the Cancer Genomics Cloud (CGC).
The sevenbridges
package only supports v2+ versions of the API, since versions prior to V2 are not compatible with the Common Workflow Language (CWL). This package provides a simple interface for accessing and trying out various methods.
There are two ways of constructing API calls. For instance, you can use low-level API calls which use arguments like path
, query
, and body
. These are documented in the API reference libraries for the Seven Bridges Platform and the CGC. An example of a low-level request to “list all projects” is shown below. In this request, you can also pass query
and body
as a list.
library("sevenbridges")
a <- Auth(token = "8c3329a4de664c35bb657499bb2f335c",
platform = "aws-us")
a$api(path = "projects", method = "GET")
(Advanced user option) The second way of constructing an API request is to directly use the httr
package to make your API calls, as shown below.
a$project()
Before we start, keep in mind the following:
offset and limit
Every API call accepts two arguments named offset and limit.
By default, offset
is set to 0
and limit
is set to 100
. As such, your API request returns the first 100 items when you list items or search for items by name. To search and list all items, use complete = TRUE
in your API request.
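The relationship between offset, limit, and a "fetch everything" request can be pictured with a small base R sketch. It is only an illustration under stated assumptions: fetch_page() and fetch_all() are hypothetical helpers working on a local vector, not functions of the sevenbridges package, and the package may implement complete = TRUE differently.

```r
# Sketch only: `fetch_page()` is a hypothetical stand-in for one API request
# that returns at most `limit` items starting after `offset`.
fetch_page <- function(items, offset = 0, limit = 100) {
  if (offset >= length(items)) return(items[0])
  items[(offset + 1):min(offset + limit, length(items))]
}

# Hypothetical helper: keep requesting pages until a short page comes back.
fetch_all <- function(items, limit = 100) {
  out <- items[0]
  offset <- 0
  repeat {
    page <- fetch_page(items, offset = offset, limit = limit)
    out <- c(out, page)
    if (length(page) < limit) break  # a short page means nothing is left
    offset <- offset + limit
  }
  out
}

projects <- paste0("project-", 1:250)  # pretend the server holds 250 items
first_page <- fetch_page(projects)     # what a default request would see
everything <- fetch_all(projects)      # what a "complete" listing would see
length(first_page)
length(everything)
```

With the default limit, the first request sees only part of the collection, which is why listing or searching across everything needs the paging loop.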
Search by ID
When searching by ID, your request will return your exact resource as it is unique. As such, you do not have to manually set offset
and limit
. It is a good practice to find your resources by their ID and pass this ID as an input to your task. You can find a resource’s ID in the final part of the URL on the visual interface or via the API requests to list resources or get a resource’s details.
Search by name
Search by name returns all partial matches unless you specify exact = TRUE
.
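The difference between partial and exact name matching can be sketched in base R. This is an assumption-laden illustration: match_names() is a hypothetical helper, the file names are made up, and treating a "partial match" as a plain substring match is our assumption rather than documented behavior of the package.

```r
# Hypothetical item names, standing in for what a listing request returns
file_names <- c("sample.fastq", "sample.fastq.gz",
                "other_sample.fastq", "notes.txt")

# Hypothetical helper: substring match by default, equality when exact = TRUE
match_names <- function(names, name, exact = FALSE) {
  if (exact) {
    names[names == name]
  } else {
    names[grepl(name, names, fixed = TRUE)]
  }
}

partial_hits <- match_names(file_names, "sample.fastq")
exact_hits <- match_names(file_names, "sample.fastq", exact = TRUE)
partial_hits
exact_hits
```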
The sevenbridges
package is on both the Bioconductor release and devel branches, as shown below.
# try http:// if https:// URLs are not supported
source("https://bioconductor.org/biocLite.R")
biocLite("sevenbridges")
Since we are constantly improving our API and client libraries, please also visit our GitHub repository for the most recent news and the latest version of the package.
If you do not have devtools
This installation requires that you have the devtools
package. If you do not have this package, you can install it from CRAN.
install.packages("devtools")
You may get an error and need the system dependencies for curl and ssl. For example, on Ubuntu you probably need to run the following first in order to install devtools and to build vignettes, since you need pandoc.
apt-get update
apt-get install libcurl4-gnutls-dev libssl-dev pandoc pandoc-citeproc
If devtools
is already installed
Install the latest version of sevenbridges
from GitHub with the following:
source("https://bioconductor.org/biocLite.R")
biocLite("readr")
devtools::install_github("sbg/sevenbridges-r", build_vignettes = TRUE,
repos = BiocInstaller::biocinstallRepos(),
dependencies = TRUE)
If you have trouble with pandoc
and do not want to install it, set build_vignettes = FALSE
to avoid the vignettes build.
For more details about how to use the API client in R, please consult the Seven Bridges API Reference section below for a complete guide. This section is a Quickstart which walks through the basics of using the R client library to help you get started.
Auth Object
Before you can access your account via the API, you have to provide your credentials. You can obtain your credentials in the form of an “authentication token” from the Developer Tab under Account Settings on the visual interface. Once you’ve obtained this, create an Auth object so it remembers your authentication token and the path for the API. All subsequent requests will draw upon these two pieces of information.
Let’s load the package first:
library("sevenbridges")
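You have three different ways to provide your token. Choose one of the methods below:
1. Direct authentication: pass the token and the platform (or the API base URL) as arguments to Auth().
2. Environment variables: the credentials are read from the two environment variables SB_API_ENDPOINT and SB_AUTH_TOKEN.
3. User configuration file: this file, by default $HOME/.sevenbridges/credentials, provides an organized way to collect and manage all your API authentication information for Seven Bridges platforms.
Method 1: Direct authentication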
This is the most common method to construct the Auth
object. For example:
(a <- Auth(platform = "cgc", token = "your_token"))
Using platform: cgc
== Auth ==
url : https://cgc-api.sbgenomics.com/v2/
token : <your_token>
Method 2: Environment variables
To set the two environment variables in your system, you could use the function sbg_set_env()
. For example:
sbg_set_env("https://cgc-api.sbgenomics.com/v2", "your_token")
Note that this change might be only temporary. Please feel free to use the standard method to set persistent environment variables according to your operating system.
Create an Auth
object:
a <- Auth(from = "env")
Method 3: User configuration file
Assume we have already created the configuration file named credentials
under the directory $HOME/.sevenbridges/
:
[aws-us-tengfei]
api_endpoint = https://api.sbgenomics.com/v2
auth_token = token_for_this_user
# This is a comment:
# another user on the same platform
[aws-us-yintengfei]
api_endpoint = https://api.sbgenomics.com/v2
auth_token = token_for_this_user
[default]
api_endpoint = https://cgc-api.sbgenomics.com/v2
auth_token = token_for_this_user
[gcp]
api_endpoint = https://gcp-api.sbgenomics.com/v2
auth_token = token_for_this_user
To load the user profile aws-us-tengfei
from this configuration file, simply use:
a <- Auth(from = "file", profile_name = "aws-us-tengfei")
If profile_name
is not specified, we will try to load the profile named [default]
:
a <- Auth(from = "file")
Note: API paths (base URLs) differ for each Seven Bridges environment. Be sure to provide the correct path for the environment you are using. API paths for some of the environments are as follows:
Platform | API Base URL | Platform Name |
---|---|---|
Cancer Genomics Cloud (CGC) | https://cgc-api.sbgenomics.com/v2 | cgc |
Seven Bridges Platform on Amazon Web Services (US) | https://api.sbgenomics.com/v2 | aws-us |
Seven Bridges Platform on Amazon Web Services (EU) | | |
Seven Bridges Platform on Google Cloud Platform (GCP) | https://gcp-api.sbgenomics.com/v2 | |
Cavatica | | |
Please refer to the API reference section for more usage and technical details about the three authentication methods.
This call returns information about your account.
a$user()
== User ==
href : https://cgc-api.sbgenomics.com/v2/users/tengfei
username : tengfei
email : tengfei.yin@sevenbridges.com
first_name : Tengfei
last_name : Yin
affiliation : Seven Bridges Genomics
country : United States
Get information about a user
This call returns information about the specified user. Note that currently you can view only your own user information, so this call is equivalent to the call to get information about your account.
a$user(username = "tengfei")
This call returns information about your current rate limit. This is the number of API calls you can make in one hour.
a$rate_limit()
== Rate Limit ==
limit : 1000
remaining : 993
reset : 1457980957
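The fields in this output can be interpreted with a few lines of base R. The sketch below copies the values from the example output into a plain list (it does not call the API) and assumes that reset is a Unix timestamp in seconds, which is how we read the value shown above.

```r
# Values copied from the example output above; `rate` is a plain list here,
# not the object returned by a$rate_limit()
rate <- list(limit = 1000, remaining = 993, reset = 1457980957)

# Number of calls already used in the current window
used <- rate$limit - rate$remaining

# Assumption: `reset` is seconds since 1970-01-01 UTC
reset_time <- as.POSIXct(rate$reset, origin = "1970-01-01", tz = "UTC")
reset_label <- format(reset_time, "%Y-%m-%d %H:%M:%S")

used
reset_label
```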
Each project must have a Billing Group associated with it. This Billing Group pays for the storage and computation in the project.
For example, your first project(s) were created with the free funds from the Pilot Funds Billing Group assigned to each user at sign-up. To get information about billing:
# check your billing info
a$billing()
a$invoice()
For more information, use breakdown = TRUE
.
a$billing(id = "your_billing_id", breakdown = TRUE)
Projects are the core building blocks of the platform. Each project corresponds to a distinct scientific investigation, serving as a container for its data, analysis tools, results, and team of collaborators.
Create a new project called “api testing” with the billing group id
obtained above.
# get billing group id
bid <- a$billing()$id
# create new project
(p <- a$project_new(name = "api testing", bid, description = "Just a test"))
== Project ==
id : tengfei/api-testing
name : api testing
description : Just a test
billing_group_id : <fake_bid>
type : v2
-- Permission --
# list first 100
a$project()
# list all
a$project(complete = TRUE)
# return all projects whose names match "demo"
a$project(name = "demo", complete = TRUE)
# get the project you want by id
p <- a$project(id = "tengfei/api-tutorial")
Seven Bridges maintains workflows and tools available to all of its users in the Public Apps repository.
To find out more about public apps, you can do the following:
Use the sevenbridges package to find them, as shown below.
# search by name matching; complete = TRUE searches all apps,
# not limited by offset or limit.
a$public_app(name = "STAR", complete = TRUE)
# search by id is accurate
a$public_app(id = "admin/sbg-public-data/rna-seq-alignment-star/5")
# you can also get everything
a$public_app(complete = TRUE)
# default limit = 100, offset = 0 which means the first 100
a$public_app()
Now, from your Auth object, you can copy an app, identified by its app id, into your project, identified by its project id, under a new name, as shown below.
# copy
a$copy_app(id = "admin/sbg-public-data/rna-seq-alignment-star/5",
project = "tengfei/api-testing", name = "new copy of star")
# check if it is copied
p <- a$project(id = "tengfei/api-testing")
# list the apps in your project
p$app()
The short name is changed to newcopyofstar
.
== App ==
id : tengfei/api-testing/newcopyofstar/0
name : RNA-seq Alignment - STAR
project : tengfei/api-testing-2
revision : 0
Alternatively, you can copy it from the app
object.
app <- a$public_app(id = "admin/sbg-public-data/rna-seq-alignment-star")
app$copy_to(project = "tengfei/api-testing",
name = "copy of star")
You can also upload your own Common Workflow Language JSON file which describes your app to your project.
Note: Alternatively, you can directly describe your CWL tool in R with this package. Please read the vignette on “Describe CWL Tools/Workflows in R and Execution”.
# add a CWL file to your project
f.star <- system.file("extdata/app", "flow_star.json", package = "sevenbridges")
# pass the path defined above (the original referenced an undefined variable)
app <- p$app_add("starlocal", f.star)
(aid <- app$id)
You will get an app id, like the one below:
"tengfei/api-testing/starlocal/0"
It’s composed of the following elements:
- project id: tengfei/api-testing
- app short name: starlocal
- revision: 0
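As a minimal sketch of how such an id breaks down, the following base R snippet splits the example id "tengfei/api-testing/starlocal/0" on "/". parse_app_id() is a hypothetical helper written for this illustration, not part of the sevenbridges package, and it assumes the id always follows the owner/project/short-name/revision layout seen in the example.

```r
# Hypothetical helper: split an app id into project id, short name, revision.
# Assumes the layout "owner/project/short_name[/revision]".
parse_app_id <- function(id) {
  parts <- strsplit(id, "/", fixed = TRUE)[[1]]
  list(
    project = paste(parts[1:2], collapse = "/"),
    short_name = parts[3],
    # the revision is optional, e.g. when an id is given without it
    revision = if (length(parts) >= 4) as.integer(parts[4]) else NA_integer_
  )
}

parsed <- parse_app_id("tengfei/api-testing/starlocal/0")
str(parsed)
```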
Alternatively, you can describe tools in R directly, as shown below:
fl <- system.file("docker", "sevenbridges/rabix/generator.R",
package = "sevenbridges")
cat(readLines(fl), sep = "\n")
library("sevenbridges")
in.lst <- list(
input(id = "number",
description = "number of observations",
type = "integer",
label = "number",
prefix = "--n",
default = 1,
required = TRUE,
cmdInclude = TRUE),
input(id = "min",
description = "lower limits of the distribution",
type = "float",
label = "min",
prefix = "--min",
default = 0),
input(id = "max",
description = "upper limits of the distribution",
type = "float",
label = "max",
prefix = "--max",
default = 1),
input(id = "seed",
description = "seed with set.seed",
type = "float",
label = "seed",
prefix = "--seed",
default = 1))
# the same method for outputs
out.lst <- list(
output(id = "random",
type = "file",
label = "output",
description = "random number file",
glob = "*.txt"),
output(id = "report",
type = "file",
label = "report",
glob = "*.html"))
rbx <- Tool(
id = "runif",
label = "Random number generator",
hints = requirements(docker(pull = "tengfei/runif"), cpu(1), mem(2000)),
baseCommand = "runif.R",
inputs = in.lst, # or ins.df
outputs = out.lst)
fl <- "inst/docker/sevenbridges/rabix/runif.json"
write(rbx$toJSON(pretty = TRUE), fl)
Then, you can add it like this:
# rbx is the object returned by `Tool` function
app <- p$app_add("runif", rbx)
(aid <- app$id)
Please consult another tutorial vignette("apps", "sevenbridges")
about how to describe tools and flows in R.
Once you have copied the public app admin/sbg-public-data/rna-seq-alignment-star/5
into your project, username/api-testing
, the app id
in your current project is username/api-testing/newcopyofstar
. Alternatively, you can use another app you already have in your project for this Quickstart.
To draft a new task, you need to specify the following:
- The name of the task
- An optional description
- The app id of the workflow you are executing
- The inputs for your workflow
You can always check the App details on the visual interface for task input requirements. To find the required inputs with R, you need to get an App object first.
app <- a$app(id = "tengfei/api-testing-2/newcopyofstar")
# get input matrix
app$input_matrix()
app$input_matrix(c("id", "label", "type"))
# get required node only
app$input_matrix(c("id", "label", "type"), required = TRUE)
Alternatively, you can load the app from a CWL JSON and convert it into an R object first, as shown below.
f1 <- system.file("extdata/app", "flow_star.json", package = "sevenbridges")
app <- convert_app(f1)
# get input matrix
app$input_matrix()
## id label type
## 1 #sjdbGTFfile sjdbGTFfile File...
## 2 #fastq fastq File...
## 3 #genomeFastaFiles genomeFastaFiles File
## 4 #sjdbGTFtagExonParentTranscript Exons' parents name string
## 5 #sjdbGTFtagExonParentGene Gene name string
## 6 #winAnchorMultimapNmax Max loci anchors int
## 7 #winAnchorDistNbins Max bins between anchors int
## required fileTypes
## 1 FALSE null
## 2 TRUE null
## 3 TRUE null
## 4 FALSE null
## 5 FALSE null
## 6 FALSE null
## 7 FALSE null
app$input_matrix(c("id", "label", "type"))
## id label type
## 1 #sjdbGTFfile sjdbGTFfile File...
## 2 #fastq fastq File...
## 3 #genomeFastaFiles genomeFastaFiles File
## 4 #sjdbGTFtagExonParentTranscript Exons' parents name string
## 5 #sjdbGTFtagExonParentGene Gene name string
## 6 #winAnchorMultimapNmax Max loci anchors int
## 7 #winAnchorDistNbins Max bins between anchors int
app$input_matrix(c("id", "label", "type"), required = TRUE)
## id label type
## 2 #fastq fastq File...
## 3 #genomeFastaFiles genomeFastaFiles File
Note that input_matrix and output_matrix are useful accessors for Tool, Flow, and App objects. As shown below, you can also call these two functions directly on a JSON file.
tool.in = system.file("extdata/app", "tool_unpack_fastq.json", package = "sevenbridges")
flow.in = system.file("extdata/app", "flow_star.json", package = "sevenbridges")
input_matrix(tool.in)
input_matrix(tool.in, required = TRUE)
input_matrix(flow.in)
input_matrix(flow.in, c("id", "type"))
input_matrix(flow.in, required = TRUE)
output_matrix(tool.in)
output_matrix(flow.in)
In the response body, locate the names of the required inputs. Note that task inputs need to match the expected data type and name. In the above example, we see two required fields:
- fastq
- genomeFastaFiles
We also want to provide a gene feature file:
- sjdbGTFfile
You can find a list of possible input types below:
- File: some inputs accept a single file (File), while other inputs take more than one file (File arrays, FilesList, or ‘File...’). This input requires you to pass a Files object (for a single-file input), a FilesList object (for inputs which accept more than one file), or simply a list of Files objects. You can search for your file by id or by name with an exact match (exact = TRUE), as shown in the example below.
fastqs <- c("SRR1039508_1.fastq", "SRR1039508_2.fastq")
# get both files by exact name
fastq_in <- p$file(name = fastqs, exact = TRUE)
# get a single file
fasta_in <- p$file(name = "Homo_sapiens.GRCh38.dna.primary_assembly.fa",
exact = TRUE)
# get another single file
gtf_in <- p$file(name = "Homo_sapiens.GRCh38.84.gtf",
exact = TRUE)
# add new tasks
taskName <- paste0("tengfei_star-alignment ", date())
tsk <- p$task_add(name = taskName,
description = "star test",
app = "tengfei/api-testing-2/newcopyofstar/0",
inputs = list(sjdbGTFfile = gtf_in,
fastq = fastq_in,
genomeFastaFiles = fasta_in))
Remember the fastq
input expects a list of files. You can also do something as follows:
f1 <- p$file(name = "SRR1039508_1.fastq", exact = TRUE)
f2 <- p$file(name = "SRR1039508_2.fastq", exact = TRUE)
# combine the two files into a list
fastq_in <- list(f1, f2)
# or, if you know that only 2 files have names matching SRR1039508*.fastq
fastq_in <- p$file(name = "SRR1039508*.fastq", complete = TRUE)
Use complete = TRUE when there are more than 100 items.
Now let’s do a batch with 8 files in 4 groups, which is batched by metadata sample_id
and library_id
. We will assume each file has these two metadata fields entered. Since these files can be evenly grouped into 4, we will have a single parent batch task with 4 child tasks.
fastqs <- c("SRR1039508_1.fastq", "SRR1039508_2.fastq", "SRR1039509_1.fastq",
"SRR1039509_2.fastq", "SRR1039512_1.fastq", "SRR1039512_2.fastq",
"SRR1039513_1.fastq", "SRR1039513_2.fastq")
# get all 8 files
fastq_in <- p$file(name= fastqs, exact = TRUE)
# you can also try to return all SRR*.fastq files
# fastq_in <- p$file(name = "SRR*.fastq", complete = TRUE)
tsk <- p$task_add(name = taskName,
description = "Batch Star Test",
app = "tengfei/api-testing-2/newcopyofstar/0",
batch = batch(input = "fastq",
criteria = c("metadata.sample_id", "metadata.library_id")),
inputs = list(sjdbGTFfile = gtf_in,
fastq = fastq_in,
genomeFastaFiles = fasta_in))
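To see why 8 files batched by two metadata fields end up as 4 child tasks, the grouping can be mimicked locally in base R. This is a sketch under assumptions: the sample_id and library_id values are invented for illustration, and the platform performs the real grouping on the server side.

```r
# Hypothetical metadata for the 8 files; the real values live on the platform
files <- data.frame(
  name = c("SRR1039508_1.fastq", "SRR1039508_2.fastq",
           "SRR1039509_1.fastq", "SRR1039509_2.fastq",
           "SRR1039512_1.fastq", "SRR1039512_2.fastq",
           "SRR1039513_1.fastq", "SRR1039513_2.fastq"),
  sample_id = rep(c("s1", "s2", "s3", "s4"), each = 2),
  library_id = rep(c("l1", "l2", "l3", "l4"), each = 2),
  stringsAsFactors = FALSE
)

# Files sharing the same (sample_id, library_id) pair fall into one group,
# which corresponds to one child task of the batch
batch_key <- paste(files$sample_id, files$library_id, sep = "/")
groups <- split(files$name, batch_key)

length(groups)   # number of child tasks in this sketch
lengths(groups)  # number of files per child task
```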
Now you have a draft batch task. Please check it out in the visual interface. Your response body should inform you of any errors or warnings.
Note: you can also directly pass file ids or file names as character strings to the inputs list. The package will first guess whether the passed strings are file ids (24-character hexadecimal strings) or names, then convert them to a Files or FilesList object. However, as a good practice, we recommend that you construct your files object first (e.g. p$file(id = , name = )), check the value, and then pass it to the task_add inputs. This is a safer approach.
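The kind of guess described in the note can be illustrated with a regular expression. This is a sketch, not the package's actual check: looks_like_file_id() is a hypothetical helper, the example id is made up, and it assumes a file id is a string of exactly 24 lowercase hexadecimal characters.

```r
# Hypothetical helper: TRUE when a string looks like a 24-character hex id
looks_like_file_id <- function(x) {
  grepl("^[0-9a-f]{24}$", x)
}

# The first value is an invented id, the second is a file name
candidates <- c("568cf5dce4b0307bc0462060", "SRR1039508_1.fastq")
is_id <- looks_like_file_id(candidates)
is_id
```

A guess like this can misfire for a file whose name happens to be 24 hexadecimal characters, which is one reason constructing the files object explicitly is the safer route.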
Now, we are ready to run our task.
# run your task
tsk$run()
Before you run your task, you can adjust your draft task if you have any final modifications. Or, you can delete the draft task if you no longer wish to run it.
# # not run
# tsk$delete()
After you run a task, you can abort it.
# abort your task
tsk$abort()
If you want to update your task and then re-run it, follow the example below.
tsk$getInputs()
# update only the sjdbGTFfile input
tsk$update(inputs = list(sjdbGTFfile = "some new file"))
# double check
tsk$getInputs()
To monitor your task as it runs, you can always request a task update
to ask your task to report its status. Or, you can monitor a running task with a hook function, which triggers the function when the status is “completed”. Please check the details in the section below.
tsk$update()
By default, your task alerts you by email when it has been completed.
# Monitor your task (skip this part)
# tsk$monitor()
Use the following to download all files from a completed task.
tsk$download("~/Downloads")
Instead of the default task monitor action, you can use setTaskHook
to connect a function call to the status of a task. When you run tsk$monitor(time = 30)
it will check your task every 30 seconds to see if the current task status matches one of the following statuses: “queued”, “draft”, “running”, “completed”, “aborted”, and “failed”. When it finds a match for the task status, getTaskHook
returns the function call for the specific status.
getTaskHook("completed")
## function (...)
## {
## cat("\\r", "completed")
## return(TRUE)
## }
## <environment: 0x539e038>
If you want to customize the monitor function, your function must meet the following requirement: it must return TRUE or FALSE at the end. When it returns TRUE (or a non-logical value), the monitoring terminates after it finds a matching status and the function executes, such as when the task is completed. When it returns FALSE, the monitoring continues with the next iteration of checking; for example, when the status is “running”, you want to keep tracking.
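The TRUE/FALSE contract can be pictured with a small simulation in base R. This is a sketch under assumptions: hooks and monitor_sketch() are hypothetical stand-ins written for this illustration, the status sequence is simulated instead of being polled from the API, and the real monitor in the package may differ in detail.

```r
# Hypothetical hook registry: one function per task status
hooks <- list(
  running = function() {
    cat("still running\n")
    FALSE  # keep monitoring
  },
  completed = function() {
    cat("completed\n")
    TRUE   # stop monitoring
  }
)

# Hypothetical monitor: walks through simulated status checks and stops as
# soon as a hook returns anything other than FALSE
monitor_sketch <- function(statuses, hooks) {
  checks <- 0
  for (status in statuses) {
    checks <- checks + 1
    hook <- hooks[[status]]
    if (is.null(hook)) next           # no hook registered for this status
    result <- hook()
    if (!identical(result, FALSE)) break
  }
  checks
}

simulated <- c("queued", "running", "running", "completed", "running")
n_checks <- monitor_sketch(simulated, hooks)
n_checks
```

The last simulated status is never inspected, because the "completed" hook returned TRUE and ended the loop.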
Follow the example below to set a new function to monitor the status “completed”. Then, when the task is completed, it will download all task output files to a local folder.
setTaskHook("completed", function(){
tsk$download("~/Downloads")
return(TRUE)
})
tsk$monitor()
The sevenbridges
package provides a user-friendly interface so you do not have to combine several api()
calls and constantly reference the API documentation to issue API requests.
Before you can interact with the API, you need to construct an Auth object which stores the following information:
- Your authentication token
- The API base URL
The general authentication logic for Auth() is as follows:
1. Direct authentication is used when from is not specified explicitly or is specified as from = "direct".
2. The authentication information is loaded from environment variables when from = "env", or from a user configuration file when from = "file".
To use direct authentication, users need to specify one of platform or url, with the corresponding token. Examples of direct authentication:
a = Auth(token = "1c0e6e202b544030870ccc147092c250",
platform = "aws-us")
The above will use the Seven Bridges Platform on AWS (US).
a = Auth(token = "1c0e6e202b544030870ccc147092c257",
url = "https://gcp-api.sbgenomics.com/v2")
The above will use the specified url
as base URL for the API calls. In this example, the url
points to the Seven Bridges Platform on Google Cloud Platform (GCP).
a = Auth(token = "2c0e6e202b544030870ccc147092c257")
The above will use the Cancer Genomics Cloud environment, since neither platform nor url was explicitly specified (not recommended).
Note: platform and url should not be specified at the same time.
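The rules for direct authentication (platform or url, but not both, with the CGC as the fallback) can be summarized in a short base R sketch. It is an illustration only: resolve_base_url() and known_platforms are hypothetical names, the lookup table contains just the two platform names that appear in this document, and Auth() may resolve its arguments differently internally.

```r
# Hypothetical lookup table, limited to platform names used in this document
known_platforms <- c(
  "cgc" = "https://cgc-api.sbgenomics.com/v2",
  "aws-us" = "https://api.sbgenomics.com/v2"
)

# Hypothetical helper mirroring the rules described above
resolve_base_url <- function(platform = NULL, url = NULL) {
  if (!is.null(platform) && !is.null(url)) {
    stop("specify either `platform` or `url`, not both")
  }
  if (!is.null(url)) return(url)
  if (is.null(platform)) platform <- "cgc"  # fallback when nothing is given
  if (!platform %in% names(known_platforms)) {
    stop("unknown platform: ", platform)
  }
  unname(known_platforms[platform])
}

url_aws <- resolve_base_url(platform = "aws-us")
url_custom <- resolve_base_url(url = "https://gcp-api.sbgenomics.com/v2")
url_default <- resolve_base_url()
both <- try(resolve_base_url(platform = "cgc", url = url_custom), silent = TRUE)
```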
The R API client supports reading authentication information stored in system environment variables.
To set the two environment variables in your system, you could use the function sbg_set_env()
. For example:
sbg_set_env(url = "https://cgc-api.sbgenomics.com/v2",
token = "your_token")
See if the environment variables are correctly set:
sbg_get_env("SB_API_ENDPOINT")
## "https://cgc-api.sbgenomics.com/v2"
sbg_get_env("SB_AUTH_TOKEN")
## "your_token"
To create an Auth
object:
a <- Auth(from = "env")
To unset the two environment variables:
Sys.unsetenv("SB_API_ENDPOINT")
Sys.unsetenv("SB_AUTH_TOKEN")
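The same round trip can be tried with base R alone, which also makes the temporary nature of the change visible: Sys.setenv() affects only the current R session. Whether sbg_set_env() and sbg_get_env() are thin wrappers around these base functions is an assumption on our part.

```r
# Set the two variables for the current R session only
Sys.setenv(SB_API_ENDPOINT = "https://cgc-api.sbgenomics.com/v2",
           SB_AUTH_TOKEN = "your_token")

endpoint <- Sys.getenv("SB_API_ENDPOINT")
token <- Sys.getenv("SB_AUTH_TOKEN")

# Both values must be non-empty before they are usable as credentials
has_credentials <- nzchar(endpoint) && nzchar(token)

# Clean up, then confirm the token is gone (NA marks an unset variable)
Sys.unsetenv(c("SB_API_ENDPOINT", "SB_AUTH_TOKEN"))
token_after <- Sys.getenv("SB_AUTH_TOKEN", unset = NA)
```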
You can create an ini-like file named credentials
under the folder $HOME/.sevenbridges/
and maintain your credentials for multiple accounts across various Seven Bridges environments. An example:
[aws-us-tengfei]
api_endpoint = https://api.sbgenomics.com/v2
auth_token = token_for_this_user
# This is a comment:
# another user on the same platform
[aws-us-yintengfei]
api_endpoint = https://api.sbgenomics.com/v2
auth_token = token_for_this_user
[default]
api_endpoint = https://cgc-api.sbgenomics.com/v2
auth_token = token_for_this_user
[gcp]
api_endpoint = https://gcp-api.sbgenomics.com/v2
auth_token = token_for_this_user
Please make sure to have two fields exactly named as api_endpoint
and auth_token
under each profile.
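To make the expected file layout concrete, here is a minimal base R sketch that reads such an ini-like file into a list of profiles and checks that each profile has the two required fields. parse_credentials() is a hypothetical helper written for this illustration; it is not the parser used by the sevenbridges package and handles only the simple syntax shown above.

```r
# Hypothetical parser for the simple "[profile]" / "key = value" layout
parse_credentials <- function(lines) {
  profiles <- list()
  current <- NULL
  for (line in trimws(lines)) {
    if (line == "" || startsWith(line, "#")) next   # skip blanks and comments
    if (grepl("^\\[.+\\]$", line)) {
      current <- sub("^\\[(.+)\\]$", "\\1", line)   # start a new profile
      profiles[current] <- list(list())
    } else if (!is.null(current) && grepl("=", line, fixed = TRUE)) {
      key <- trimws(sub("=.*$", "", line))
      value <- trimws(sub("^[^=]*=", "", line))
      profiles[[current]][[key]] <- value
    }
  }
  profiles
}

example <- c(
  "[aws-us-tengfei]",
  "api_endpoint = https://api.sbgenomics.com/v2",
  "auth_token = token_for_this_user",
  "# This is a comment:",
  "[default]",
  "api_endpoint = https://cgc-api.sbgenomics.com/v2",
  "auth_token = token_for_this_user"
)

profiles <- parse_credentials(example)

# Every profile needs exactly these two field names
required <- c("api_endpoint", "auth_token")
valid <- vapply(profiles, function(p) all(required %in% names(p)), logical(1))
valid
```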
To load the default profile (named [default]
) from the default user configuration file ($HOME/.sevenbridges/credentials
), please use:
a <- Auth(from = "file")
To load the user profile aws-us-tengfei
from this configuration file, change the profile_name
:
a <- Auth(from = "file", profile_name = "aws-us-tengfei")
To use a user configuration file from other locations (not recommended), please specify the file path using the argument config_file
. For example:
a <- Auth(from = "file", config_file = "~/sevenbridges.cfg",
profile_name = "aws-us-tengfei")
Note: If you edited the credentials
file, please use Auth()
to re-authenticate.
If you do not pass any parameters to api() from Auth, it will list all API calls. Any parameters you provide will be passed to the api() function, but you do not have to pass your token and path once more, since the Auth object already has this information. The following call from the Auth object will check the response as well.
a$api()
offset and limit
Every API call accepts two arguments named offset and limit.
- offset defines where the retrieved items start.
- limit defines the number of items you want to get.
By default, offset is set to 0 and limit is set to 100. As such, your API request returns the first 100 items when you list items or search for items by name. To search and list all items, use complete = TRUE in your API request.
getOption("sevenbridges")$offset
getOption("sevenbridges")$limit
When searching by ID, your request will return your exact resource as it is unique. As such, you do not have to manually set offset
and limit
. It is a good practice to find your resources by their ID and pass this ID as an input to your task. You can find a resource’s ID in the final part of the URL on the visual interface or via the API requests to list resources or get a resource’s details.
Search by name returns all partial matches unless you specify exact = TRUE
. This type of search will only search across current pulled content, so use complete = TRUE
if you want to search across everything.
For example, to list all public apps, use the visibility argument, but make sure you pass complete = TRUE to it to show every single item. This argument generally works for items like “App”, “Project”, “Task”, “File”, etc.
# first, search by id is fast
x = a$app(visibility = "public", id = "admin/sbg-public-data/sbg-ucsc-b37-bed-converter/1")
# show 100 items from public
x = a$app(visibility = "public")
length(x) # 100
x = a$app(visibility = "public", complete = TRUE)
length(x) # 272 by Nov 2016
# this returns nothing, because it is not in the first 100 returned names
a$app(visibility = "public", name = "bed converter")
# this returns an app, because it pulls *all* app names and then searches
a$app(visibility = "public", name = "bed converter", complete = TRUE)
Similar to offset
and limit
, every API call accepts an argument named advance_access
. This argument was first introduced in August 2017 and controls if a special field in the HTTP request header will be sent, which can enable the access to the “Advance Access” features in the Seven Bridges API. Note that the Advance Access features in the API are not officially released yet, therefore the API usages are subject to change, so please use with discretion.
In addition to modifying each API call that uses Advance Access features, the option can also be set globally at the beginning of your API script. This offers a one-button switch for users who want to experiment with the Advance Access features. The option is disabled by default:
library("sevenbridges")
getOption("sevenbridges")$advance_access
## [1] FALSE
For example, the “folders” API is in Advance Access as of August, 2017. If we try to use the folders API to create a folder with the advance_access
option disabled, it will return an error message:
req = api(
token = "your_token", path = "files",
method = "POST", body = list(
"name" = "new-folder", "type" = "folder",
"project" = "owner/project"))
httr::content(req)$"message"
## [1] "Advance access feature needs X-SBG-Advance-Access: advance header."
To enable the Advance Access features, one can use
opt = getOption("sevenbridges")
opt$advance_access = TRUE
options(sevenbridges = opt)
at the beginning of their scripts. Let’s check if the option has been enabled:
getOption("sevenbridges")$advance_access
## [1] TRUE
Send the API call again:
req = api(
token = "your_token", path = "files",
method = "POST", body = list(
"name" = "new-folder", "type" = "folder",
"project" = "owner/project"))
The information of the newly created folder will be returned:
httr::content(req)
## $href
## [1] "https://api.sbgenomics.com/v2/files/{folder_id}"
##
## $id
## [1] "folder_id"
##
## $name
## [1] "new-folder"
##
## $project
## [1] "owner/project"
##
## $parent
## [1] "parent_folder_id"
##
## $type
## [1] "folder"
##
## $created_on
## [1] "2017-08-22T12:49:21Z"
##
## $modified_on
## [1] "2017-08-22T12:49:21Z"
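How a global option can switch a request header on and off is sketched below in base R. It is an illustration under assumptions: the option name sbg_sketch and the helper build_extra_headers() are hypothetical (chosen so the sketch does not touch the real sevenbridges option), and only the header name and value are taken from the error message shown earlier.

```r
# Hypothetical option, kept separate from the real "sevenbridges" option
options(sbg_sketch = list(advance_access = FALSE))

# Hypothetical helper: add the Advance Access header only when enabled
build_extra_headers <- function() {
  headers <- character(0)
  if (isTRUE(getOption("sbg_sketch")$advance_access)) {
    headers["X-SBG-Advance-Access"] <- "advance"
  }
  headers
}

headers_off <- build_extra_headers()  # option disabled: no extra header

# Flip the switch the same way as shown for the real option
opt <- getOption("sbg_sketch")
opt$advance_access <- TRUE
options(sbg_sketch = opt)

headers_on <- build_extra_headers()   # option enabled: header is added
headers_on
```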
'fields'
Please read the documentation for details.
All API calls take the optional query parameter fields. This parameter enables you to specify the fields you want to be returned when listing resources (e.g. all your projects) or getting details of a specific resource (e.g. a given project).
The fields parameter can be used in the following ways:
1. No fields parameter specified: calls return default fields. For calls that return complete details of a single resource, this is all their properties; for calls that list resources of a certain type, this is some default properties.
2. The fields parameter can be set to a list of fields: for example, to return the fields id, name and size for files in a project, you may issue the call p$file(fields = "id,name,size").
3. The fields parameter can be used to exclude a specific field: if you wish to omit a certain field from the response, do so using the fields parameter with the prefix !. For example, to get the details of a file without listing its metadata, issue the call p$file(fields = '!metadata'). The entire metadata field will be removed from the response.
4. The fields parameter can be used to include or omit certain nested fields, in the same way as listed in 2 and 3 above: for example, you can use metadata.sample_id or origin.task for files.
To see all fields for a resource, specify fields="_all"
. This returns all fields for each resource returned. Note that if you are getting the details of a specific resource, the use of fields="_all"
won’t return any more properties than would have been shown without this parameter — the use case is instead for when you are listing details of many resources. Please use with care if your resource has particularly large fields; for example, the raw field for an app resource contains the complete CWL specification of the app which can result in bulky response if listing many apps.
Negations and nesting can be combined freely, so, for example, you can issue p$file(fields="id,name,status,!metadata.library,!origin")
or p$task(fields="!inputs,!outputs")
.
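Since the fields value is just a comma-separated string with ! marking exclusions, it can be assembled programmatically. The helper below, build_fields(), is a hypothetical convenience function written for this illustration and is not part of the sevenbridges package.

```r
# Hypothetical helper: build a `fields` string from names to include
# and names to exclude (the latter get the "!" prefix)
build_fields <- function(include = character(0), exclude = character(0)) {
  negated <- if (length(exclude) > 0) paste0("!", exclude) else character(0)
  paste(c(include, negated), collapse = ",")
}

file_fields <- build_fields(include = c("id", "name", "status"),
                            exclude = c("metadata.library", "origin"))
task_fields <- build_fields(exclude = c("inputs", "outputs"))

file_fields
task_fields
```

The resulting strings could then be passed on, for example as p$file(fields = file_fields).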
Please try the following examples:
## default fields id, name, project
p$file()
## return file(s) with id, name, size information
p$file(fields = "id,name,size")
## return file(s) with all available info
p$file(detail = TRUE)
## same as above
p$file(fields = "_all")
This call returns information about your current rate limit. This is the number of API calls you can make in one hour.
a$rate_limit()
This call returns a list of the resources, such as projects, billing groups, and organizations, that are accessible to you. Currently, this call will only return a successful response if {username} is replaced with your own username. Be sure to capitalize your username in the same way as when you registered for an account.
If you do not provide a username, your own user information will be shown.
# return your information
a$user()
# return user tengfei's information
a$user("tengfei")
If no billing group id
is provided, this call returns a list of paths used to access billing information via the API. If a username is provided, this call lists all your billing groups, including groups that are pending or which have been disabled. If you specify breakdown = TRUE
, the call below returns a breakdown of the spending per-project for the billing group specified by billing_group. Information is also displayed for each of the projects a particular billing group is associated with, including task executions, their initiating user, start and end time, and their cost.
# return a BillingList object
(b <- a$billing())
a$billing(id = b$id, breakdown = TRUE)
If no Billing Group id is provided, this call returns a list of invoices, with information about each, including whether or not the invoice is pending and the billing period it covers. This call returns information about all your available invoices, unless you use the query parameter bg_id to specify the ID of a particular Billing Group, in which case it will return the invoice incurred by that Billing Group only. If an invoice id is provided, this call retrieves information about the specified invoice, including the costs for analysis and storage and the invoice period.
a$invoice()
a$invoice(id = "fake_id")
Note: Currently, invoice is not an object. Instead, it just returns a list.
Projects are the basic units to organize different entities: files, tasks, apps, etc. As such, many actions stem from the Project
object.
The following call returns a list of all projects of which you are a member. Each project’s project_id
and path will be returned.
a$project()
If you want to list the projects owned by and accessible to a particular user, specify the owner argument as follows. Each project’s ID and path will be returned.
a$project(owner = "tengfei")
a$project(owner = "yintengfei")
To get details about project(s), use detail = TRUE, as shown below.
a$project(detail = TRUE)
For a friendlier interface and more convenient search, the sevenbridges package supports partial name matching. The first argument of the following request is name.
# return projects whose names partially match "hello"
a$project("hello")
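The difference between partial and exact name matching can be illustrated locally. The following is a minimal sketch in base R, not the package's implementation: project_names, match_partial, and match_exact are hypothetical names, and the sketch assumes a case-sensitive substring match, which may differ from what the API actually does.

```r
# Hypothetical project names standing in for the result of a listing call
project_names <- c("hello", "hello-world", "say-hello", "demo")

# Partial match: keep every candidate that contains the pattern as a substring
match_partial <- function(pattern, candidates) {
  candidates[grepl(pattern, candidates, fixed = TRUE)]
}

# Exact match: keep only candidates identical to the pattern
match_exact <- function(pattern, candidates) {
  candidates[candidates == pattern]
}

partial_hits <- match_partial("hello", project_names)
exact_hits <- match_exact("hello", project_names)

print(partial_hits)
print(exact_hits)
```

Because a partial match can return several projects, searching by id or with exact = TRUE is the safer choice when exactly one project is needed.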
To create a new project, specify the following:
name (required)
billing_group_id (required)
description (optional)
tags (optional): This has to be a list(). If you are using the API on the CGC environment, you can create a TCGA Controlled Data project by specifying tcga in tags.
type (optional): By default, we create a V2, CWL-compatible project.
a$project_new("api_testing_tcga", b$id,
              description = "Test for API")
Follow the directions above, but pass tcga as a value for tags.
a$project_new("controlled_project", b$id,
description = "Test for API", tags = list("tcga"))
You can delete a single project by calling $delete() on it. Note that a$project() sometimes returns a list if you use partial matching by name, and the $delete() request cannot operate on a list. If you want to operate on a list of objects, read more about batch functions in the relevant section below.
# delete a single project (not run)
a$project("api_testing")$delete()
# delete all projects matching the name
delete(a$project("api_testing_donnot_delete_me"))
You can update the following information about an existing project:
name
description
billing_group_id
a$project(id = "tengfei/helloworld")
a$project(id = "tengfei/helloworld")$update(name = "Hello World Update",
description = "Update description")
This call returns a list of the members of the specified project. For each member, the response lists the member’s username and permissions in the project.
a$project(id = "tengfei/demo-project")$member()
This call adds a new user to the specified project. It can only be made by a user who has admin permissions in the project.
Requests to add a project member must include the key permissions. However, if you do not include a value, the member’s permissions will be set to false by default.
Set permissions by passing the following values: copy, write, execute, admin, or read.
Note: read is implicit and set by default. You cannot be a project member without having read permissions.
m <- a$project(id = "tengfei/demo-project")$
member_add(username = "yintengfei")
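The permission defaults described above can be modeled with a small helper. This is only an illustrative sketch in base R: make_permissions is a hypothetical function, not part of the sevenbridges package, and it simply encodes the rule that read is always granted while every other permission is off unless requested.

```r
# Build a permissions list; every permission except `read` defaults to FALSE
make_permissions <- function(copy = FALSE, write = FALSE,
                             execute = FALSE, admin = FALSE) {
  list(
    read = TRUE,  # read is implicit for every project member
    copy = copy,
    write = write,
    execute = execute,
    admin = admin
  )
}

default_perms <- make_permissions()
analyst_perms <- make_permissions(copy = TRUE, execute = TRUE)

str(default_perms)
str(analyst_perms)
```

Keeping the defaults in one place makes it harder to grant write or admin access to a new member by accident.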
This call edits a user’s permissions in the specified project. It can only be made by a user who has admin permissions in the project.
m <- a$project(id = "tengfei/demo-project")$
member(username = "yintengfei")
m$update(copy = TRUE)
== Member ==
username : yintengfei
-- Permission --
read : TRUE
write : FALSE
copy_permission : TRUE
execute : FALSE
admin : FALSE
To delete an existing member, just call the delete() action on the Member object.
m$delete()
# confirm
a$project(id = "tengfei/demo-project")$member()
To list all files belonging to a project, use the following request:
p <- a$project(id = "tengfei/demo-project")
p$file()
All Seven Bridges environments support the Common Workflow Language (CWL) natively to allow for reproducible and portable workflows and tools. In this section, we will work with apps, including tools and workflows, via the API using the sevenbridges R package.
This call lists all the apps available to you.
a$app()
# or show details
a$app(detail = TRUE)
Search for an app by name
To search for a specific app by its name, pass a pattern for the name argument or provide a unique id, as shown below.
# pattern match
a$app(name = "STAR")
# unique id
aid <- a$app()[[1]]$id
aid
a$app(id = aid)
# get a specific revision from an app
a$app(id = aid, revision = 0)
List all apps in a project
To list all apps belonging to a specific project, use the project argument.
# my favorite, always
a$project("demo")$app()
# or alternatively
pid <- a$project("demo")$id
a$app(project = pid)
List all public apps
To list all public apps, use the visibility argument.
# show 100 items from public
x <- a$app(visibility = "public")
length(x)
x <- a$app(visibility = "public", complete = TRUE)
length(x)
x <- a$app(project = "tengfei/helloworld", complete = TRUE)
length(x)
a$app(visibility = "public", limit = 5, offset = 150)
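The relationship between offset, limit, and complete = TRUE can be pictured as a paging loop. The sketch below is conceptual and runs entirely on local data: all_items, fetch_page, and fetch_all are hypothetical stand-ins, and the loop is an assumption about how a complete listing could be assembled, not a description of the package's internals.

```r
# Hypothetical data standing in for the items held on the server
all_items <- sprintf("app-%03d", 1:230)

# Stand-in for a paged listing call: returns at most `limit` items,
# starting after skipping `offset` items
fetch_page <- function(offset = 0, limit = 100) {
  if (offset >= length(all_items)) return(character(0))
  first <- offset + 1
  last <- min(offset + limit, length(all_items))
  all_items[first:last]
}

# Keep requesting pages until one comes back shorter than `limit`
fetch_all <- function(limit = 100) {
  collected <- character(0)
  offset <- 0
  repeat {
    page <- fetch_page(offset = offset, limit = limit)
    collected <- c(collected, page)
    if (length(page) < limit) break
    offset <- offset + limit
  }
  collected
}

first_page <- fetch_page()   # default request: first 100 items only
everything <- fetch_all()    # analogous to asking for the complete list
length(first_page)
length(everything)
```

With the defaults, a single request sees only the first page, which is why a listing can look truncated when more than 100 items exist.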
Search through all public apps in all locations
To search for a public app across all locations, make the following call. Note that this may take a bit of time.
a$app("STAR", visibility = "public", complete = TRUE)
This call copies the specified app to a specified project. The app should be one in a project that you can access, either an app which has been uploaded by a project member or a publicly available app which has been copied to the project.
This call takes the following two arguments:
project: include the project id
name (optional): use this field to rename your app
aid <- a$public_app()[[1]]$id
a$copy_app(aid, project = pid, name = "copy-rename-test")
# check if it is copied
a$app(project = pid)
You can also copy directly from the app object, as shown below.
app <- a$public_app(id = "admin/sbg-public-data/rna-seq-alignment-star")
app$copy_to(project = "tengfei/api-testing",
name = "copy of star")
This call returns the raw CWL of a specific app. It differs from the call to GET details of an app in that it returns a JSON object that is the CWL.
The app should be one in a project that you can access, either an app which has been uploaded by a project member or a publicly available app which has been copied to the project.
To get a specific revision, pass the revision argument.
ap <- a$app(visibility = "public")[[1]]
a$project("demo")$app("index")
# get a specific revision
a$project("demo")$app("index", revision = 0)
Coming soon: converting apps to CWL objects
To add a CWL object as an app, use the app_add function call for a Project object. The following two parameters are required:
short_name: Project short names are based on the name you give to a project when you create it. Learn more about project short names on the Seven Bridges Platform or the CGC.
filename: The name of the JSON file containing the CWL.
cwl.fl <- system.file("extdata", "bam_index.json", package = "sevenbridges")
a$project("demo")$app_add(short_name = "new_bam_index_app", filename = cwl.fl)
a$project("demo")$app_add(short_name = "new_bam_index_app", revision = 2, filename = cwl.fl)
Note: If you provide the same short_name, this will add a new revision.
This is introduced in another vignette (vignette("apps", "sevenbridges")).
On the Seven Bridges Platform, when you create or update a tool in the GUI, a test tab lets you tweak the parameters and see the resulting command line in a simulated terminal. To do the same via R when you push your Tool object to your project, provide the “sbg:job” information, as in the example shown below.
rbx <- Tool(id = "runif",
label = "Random number generator",
hints = requirements(docker(pull = "tengfei/runif"),
cpu(1), mem(2000)),
baseCommand = "runif.R",
inputs = in.lst,
outputs = out.lst,
'sbg:job' = list(allocatedResources = list(mem = 9000, cpu = 1),
inputs = list(min = 1, max = 150)))
p$app_add("random", rbx)
If you have already created test information on the platform, or previously pushed it, and you want to keep that test setup, use the keep_test argument to keep the previous revision’s test information.
rbx <- Tool(id = "runif",
label = "Random number generator",
hints = requirements(docker(pull = "tengfei/runif"),
cpu(1), mem(2000)),
baseCommand = "runif.R",
inputs = in.lst,
outputs = out.lst)
p$app_add("random", rbx, keep_test = TRUE)
This call returns a list of tasks that you can access. You are able to filter tasks by status.
# all tasks
a$task()
# filter
a$task(status = "completed")
a$task(status = "running")
To list all tasks in a project, use the following.
# a better way
a$project("demo")$task()
# alternatively
pid <- a$project("demo")$id
a$task(project = pid)
To list all tasks with details, pass detail = TRUE.
p$task(id = "your task id here", detail = TRUE)
p$task(detail = TRUE)
To list the child tasks of a batch task, pass the batch parent task ID as the parent parameter.
p <- a$project(id = "tengfei/demo")
p$task(id = "2e1ebed1-c53e-4373-870d-4732acacbbbb")
p$task(parent = "2e1ebed1-c53e-4373-870d-4732acacbbbb")
p$task(parent = "2e1ebed1-c53e-4373-870d-4732acacbbbb", status = "completed")
p$task(parent = "2e1ebed1-c53e-4373-870d-4732acacbbbb", status = "draft")
To create a draft task, call the task_add function from the Project object and pass the following arguments:
name: The name for this task
description (optional): A description for this task
app: The app id
inputs: A list of inputs for this task
# push an app first
fl.runif <- system.file("extdata", "runif.json", package = "sevenbridges")
a$project("demo")$app_add("runif_draft", fl.runif)
runif_id <- "tengfei/demo-project/runif_draft"
# create a draft task
a$project("demo")$task_add(name = "Draft runif 3",
description = "Description for runif 3",
app = runif_id,
inputs = list(min = 1, max = 10))
# confirm
a$project("demo")$task(status = "draft")
Call the update function from a Task object to update the following:
name
description
inputs list. Note that only the items you provide are updated.
# get the single task you want to update
tsk <- a$project("demo")$task("Draft runif 3")
tsk
tsk$update(name = "Draft runif update",
description = "draft 2",
inputs = list(max = 100))
# alternative way to check all inputs
tsk$getInputs()
This call runs (“executes”) the specified task. Only tasks with a “DRAFT” status may be run.
tsk$run()
# calling update() without arguments just returns the latest information
tsk$update()
Monitor a running task and set the hook function
To monitor a running task, call monitor from a Task object.
tsk$monitor()
You can get and set the default hook function for each task status. Currently, failed tasks will break the monitoring.
Note: A hook function has to return TRUE (break monitoring) or FALSE (continue monitoring) at the end.
getTaskHook("completed")
getTaskHook("draft")
setTaskHook("draft", function(){message("never happens"); return(TRUE)})
getTaskHook("draft")
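The hook contract, where TRUE stops monitoring and FALSE continues, can be tried out without a running task. The following is a rough sketch with simulated data: hooks, simulated_statuses, and monitor_sketch are hypothetical names, and the loop is an assumption about how a monitor might use hooks, not the package's own monitor implementation.

```r
# Hypothetical hook table: one function per status, each returning
# TRUE (stop monitoring) or FALSE (keep polling)
hooks <- list(
  running = function() { message("still running"); FALSE },
  completed = function() { message("task finished"); TRUE },
  failed = function() { message("task failed"); TRUE }
)

# Simulated sequence of statuses, standing in for repeated status requests
simulated_statuses <- c("running", "running", "completed", "running")

# Call the hook for each observed status and stop as soon as one returns TRUE;
# returns the number of polls that were performed
monitor_sketch <- function(statuses, hooks) {
  polls <- 0
  for (status in statuses) {
    polls <- polls + 1
    hook <- hooks[[status]]
    if (is.function(hook) && isTRUE(hook())) break
  }
  polls
}

n_polls <- monitor_sketch(simulated_statuses, hooks)
n_polls
```

A hook that forgets to return a logical value would never satisfy the stop condition in this sketch, which is why the final return value matters.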
This call aborts the specified task. Only tasks whose status is “RUNNING” may be aborted.
# abort
tsk$abort()
# check
tsk$update()
Note that you can only delete DRAFT tasks, not running tasks.
tsklst <- a$task(status = "draft")
# delete a single task
tsklst[[1]]$delete()
# confirm
a$task(status = "draft")
# delete a list of tasks
delete(tsklst)
To run tasks in batch mode, see ?batch for more details. The code below is a mock example.
# batch by items
(tsk <- p$task_add(name = "RNA DE report new batch 2",
description = "RNA DE analysis report",
app = rna.app$id,
batch = batch(input = "bamfiles"),
inputs = list(bamfiles = bamfiles.in,
design = design.in,
gtffile = gtf.in)))
# batch by metadata; the input files must have the metadata fields specified
(tsk <- p$task_add(name = "RNA DE report new batch 3",
description = "RNA DE analysis report",
app = rna.app$id,
batch = batch(input = "fastq",
c("metadata.sample_id", "metadata.library_id")),
inputs = list(bamfiles = bamfiles.in,
design = design.in,
gtffile = gtf.in)))
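Batching by metadata can be thought of as grouping the input files by their metadata values. The sketch below illustrates that idea locally, assuming that each unique combination of the chosen metadata fields forms one group of inputs (and therefore one child task). The files data frame and its values are made up for illustration.

```r
# Hypothetical input files with the metadata fields used as batch criteria
files <- data.frame(
  name = c("a_1.fastq", "a_2.fastq", "b_1.fastq", "b_2.fastq", "c_1.fastq"),
  sample_id = c("s1", "s1", "s2", "s2", "s2"),
  library_id = c("l1", "l1", "l1", "l1", "l2"),
  stringsAsFactors = FALSE
)

# Group file names by the (sample_id, library_id) combinations;
# drop = TRUE discards combinations that have no files
groups <- split(files$name,
                list(files$sample_id, files$library_id),
                drop = TRUE)

length(groups)
groups
```

Files that lack one of the metadata fields could not be assigned to a group in this model, which matches the requirement that the input files have the metadata fields specified.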
Cloud storage providers come with their own interfaces, features and terminology. At a certain level, though, they all view resources as data objects organized in repositories. Authentication and operations are commonly defined on those objects and repositories, and while each cloud provider might call these things different names and apply different parameters to them, their basic behavior is the same.
Seven Bridges environments mediate access to these repositories using volumes. A volume is associated with a particular cloud storage repository that you have enabled Seven Bridges to read from (and, optionally, to write to). Currently, volumes may be created using two types of cloud storage repository: Amazon Web Services’ (AWS) S3 buckets and Google Cloud Storage (GCS) buckets.
A volume enables you to treat the cloud repository associated with it as external storage. You can ‘import’ files from the volume to your Seven Bridges environment to use them as inputs for computation. Similarly, you can write files from the Seven Bridges environment to your cloud storage by ‘exporting’ them to your volume.
Learn more about volumes on the Seven Bridges Platform and the CGC.
a <- Auth(user = "tengfei", platform = "aws-us")
a$add_volume(name = "tutorial_volume",
type = "s3",
bucket = "tengfei-demo",
prefix = "",
access_key_id = "AKIAJQENSIA4DJQNZO3A",
secret_access_key = "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",
sse_algorithm = "AES256",
access_mode = "RW")
# list all volumes
a$volume()
# get unique volume by id
a$volume(id = "tengfei/tengfei_demo")
# partial search by name
a$volume(name = "demo")
This call deletes a volume you’ve created to refer to storage on Amazon Web Services or Google Cloud Storage.
Note that any files you’ve imported from your volume onto a Seven Bridges environment, known as an alias, will no longer be usable. If a new volume is created with the same volume_id as the deleted volume, aliases will point to files on the newly created volume instead (if those exist).
a$volume(id = "tengfei/tengfei_demo")$delete()
This call imports a file from a volume to your project.
v <- a$volume(id = "tengfei/tutorial_volume")
res <- v$import(location = "A-RNA-File.bam.bai",
project = "tengfei/s3tutorial",
name = "new.bam.bai",
overwrite = TRUE)
# get job status update
# state will be "COMPLETED" when it's finished, otherwise "PENDING"
v$get_import_job(res$id)
v
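Because an import job stays "PENDING" until it is "COMPLETED", a script usually has to poll for the state. The following is a minimal polling sketch that runs against a simulated job: make_fake_job and wait_for_job are hypothetical helpers, and how the state is extracted from the result of get_import_job is not shown here, so get_state is left as a user-supplied function.

```r
# Hypothetical state source: reports "PENDING" for the first `pending_polls`
# calls and "COMPLETED" afterwards
make_fake_job <- function(pending_polls = 2) {
  calls <- 0
  function() {
    calls <<- calls + 1
    if (calls > pending_polls) "COMPLETED" else "PENDING"
  }
}

# Poll `get_state` until it reports "COMPLETED" or `max_tries` is reached
wait_for_job <- function(get_state, max_tries = 10, wait_seconds = 0) {
  state <- NA_character_
  for (i in seq_len(max_tries)) {
    state <- get_state()
    if (identical(state, "COMPLETED")) {
      return(list(state = state, tries = i))
    }
    Sys.sleep(wait_seconds)
  }
  list(state = state, tries = max_tries)
}

result <- wait_for_job(make_fake_job())
timed_out <- wait_for_job(make_fake_job(pending_polls = 5), max_tries = 2)
result$state
timed_out$state
```

With a real job, wait_seconds should be greater than zero so that the loop does not send requests back to back, and max_tries gives the loop a way to give up.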
Important: When testing, please update your file in a project.
res <- v$export(file = "579fb1c9e4b08370afe7903a",
volume = "tengfei/tutorial_volume",
location = "", # when "" use old name
sse_algorithm = "AES256")
# get job status update
# state will be "COMPLETED" when it's finished, otherwise "PENDING"
v$get_export_job(res$id)
v
Seven Bridges hosts publicly accessible files and apps on its environments. The sevenbridges package provides two easy function calls from the Authentication object to search for and copy files and apps to a project.
# list the first 100 files
a$public_file()
# list by offset and limit
a$public_file(offset = 100, limit = 100)
# simply list everything!
a$public_file(complete = TRUE)
# get exact file by id
a$public_file(id = "5772b6f0507c175267448700")
# get exact file by name with exact = TRUE
a$public_file(name = "G20479.HCC1143.2.converted.pe_1_1Mreads.fastq", exact = TRUE)
# with exact = FALSE by default search by name pattern
a$public_file(name = "fastq")
a$public_file(name = "G20479.HCC1143.2.converted.pe_1_1Mreads.fastq")
Public files are hosted in the project called admin/sbg-public-data, and you can alternatively use the file request to get files you need.
For public apps, there are similar API calls.
# list the first 100 apps
a$public_app()
# list by offset and limit
a$public_app(offset = 100, limit = 50)
# search by id
a$public_app(id = "admin/sbg-public-data/control-freec-8-1/12")
# search by name in ALL apps
a$public_app(name = "STAR", complete = TRUE)
# search by name with exact match
a$public_app(name = "Control-FREEC", exact = TRUE, complete = TRUE)
In the easy API, we return an object which contains the raw response from httr as a field. You can either call response() on that object or just use the field as is.
Currently, users have to use lapply to do those operations themselves. It’s a simple implementation.
In this package, we implement delete and download for some objects like task, project, or file.
Quick cheat sheet (in progress)
## Authentication
getToken()
a <- Auth(token = token)
a <- Auth(token = token, platform = "cgc")
a <- Auth(from = "env")
a <- Auth(from = "file", profile_name = "aws-us-tengfei")
## list API
a$api()
## Rate limits
a$rate_limit()
## Users
a$user()
a$user("tengfei")
## billing
a$billing()
a$billing(id = , breakdown = TRUE)
a$invoice()
a$invoice(id = "fake_id")
## Project
### create new project
a$project_new(name = , billing_group_id = , description = )
### list all projects owned by you
a$project()
a$project(owner = "yintengfei")
### partial match
p <- a$project(name = , id = , exact = TRUE)
### delete
p$delete()
### update
p$update(name = , description = )
### members
p$member()
p$member_add(username = )
p$member(username = )$update(write = , copy = , execute = )
p$member(username = )$delete()
## file
### list all files in this project
p$file()
### list all public files
a$file(visibility = "public")
### copy
a$copyFile(c(fid, fid2), project = pid)
### delete
p$file(id = fid)$delete()
### download
p$file()[[1]]$download_url()
p$file(id = fid3)$download("~/Downloads/")
### download all
download(p$file())
### update a file
fl$update(name = , metadata = list(a = , b = , ...))
### meta
fl$meta()
fl$setMeta()
fl$setMeta(..., overwrite = TRUE)
## App
a$app()
### apps in a project
p$app()
p$app(name, id, revision = )
a$copy_app(aid, project = pid, name = )
### add
p$app_add(short_name = , filename =)
## Task
a$task()
a$task(name = , id = )
a$task(status = )
p$task()
p$task(name = , id = )
p$task(status = )
tsk <- p$task(name = , id = )
tsk$update()
tsk$abort()
tsk$run()
tsk$download()
tsk$delete()
tsk$getInputs()
tsk$monitor()
getTaskHook()
setTaskHook(status = , fun = )