I recently attended the HL7 FHIR Connectathon 29. For those who are not familiar with Connectathons, they are fairly unique events where standards enthusiasts, vendors and implementers do hands-on standards development (FHIR) and testing. As an attendee I picked one of the tracks: bulk data.
The bulk data track tests the FHIR Bulk Data Access Implementation Guide (IG) – v2.0.0 STU2. For those unfamiliar with the standards process, STU (Standard for Trial Use) refers to the level of maturity of the specification. This maturity aligns well with the associated ANSI certification process, where the highest level is normative: the "content is considered to be stable and has been ‘locked’". Connectathons test interoperability and the standards themselves, and make the normative (locked-in) version even more robust.
This particular part of the spec (IG) provides for "efficient access to large volumes of information on a group of individuals". Instead of making hundreds of thousands of individual requests, the IG defines an efficient asynchronous process for aggregating the relevant healthcare data into flat files. These flat files are in the NDJSON format (one JSON resource per line), such as:
{"resourceType":"Patient","name":[{"family":"Doe","given":["John"]}],"birthDate":"1970-01-01"}
{"resourceType":"Patient","name":[{"family":"Doe","given":["Jane"]}],"birthDate":"1960-01-01"}
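As a minimal sketch of how a client might read such a file, the following Python parses NDJSON one line at a time. The function name `parse_ndjson` is hypothetical and used only for illustration; it assumes each non-empty line holds exactly one JSON resource, as in the sample above.

```python
import json


def parse_ndjson(text):
    """Return a list with one parsed resource per non-empty line."""
    resources = []
    # Split on "\n" only, since NDJSON uses the newline as its record separator.
    for line_number, line in enumerate(text.split("\n"), start=1):
        if not line.strip():
            continue  # tolerate blank lines and a trailing newline
        try:
            resources.append(json.loads(line))
        except json.JSONDecodeError as err:
            raise ValueError(f"line {line_number} is not valid JSON: {err}") from err
    return resources


sample = (
    '{"resourceType":"Patient","name":[{"family":"Doe","given":["John"]}],"birthDate":"1970-01-01"}\n'
    '{"resourceType":"Patient","name":[{"family":"Doe","given":["Jane"]}],"birthDate":"1960-01-01"}\n'
)

for resource in parse_ndjson(sample):
    print(resource["resourceType"], resource["name"][0]["given"][0], resource["birthDate"])
```

Because each line is an independent document, a large export can be processed line by line without loading a whole file as a single JSON value.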
For the IBM FHIR Server team, I brought our own server to the Connectathon to test one scenario, Scenario 2: Bulk data export with retrieval of referenced files on a protected endpoint. Our team stood up an IBM FHIR Server deployment using Kubernetes and Helm, and configured it with SMART Backend Services Authorization and IBM Cloud Object Storage, an S3-compatible service.
This blog outlines the recipe to set up the IBM FHIR Server with SMART Backend Services Authorization and Bulk Data. The recipe also shows how to side load data into the environment.
1. Setup Prerequisites
To complete this setup, you need to set up Kubernetes and Helm. In my case, I chose to install the ibmcloud tool, as IBM Cloud hosts my Kubernetes deployment.
- Install the tools:
- Install the plugins for ibmcloud
When deploying the IBM FHIR Server, you’ll need a few plugins beyond the IBM Cloud defaults: cloud-object-storage, kubernetes-service (installed below under the name container-service), container-registry and infrastructure-service.
ibmcloud plugin repo-plugins -r "IBM Cloud"
ibmcloud plugin install cloud-object-storage -f
ibmcloud plugin install container-service
ibmcloud plugin install container-registry -f
ibmcloud plugin install infrastructure-service -f
2. Setting up the S3 Bucket with HMAC
The test environment stores the bulk exported data in an S3 bucket and serves it through presigned URLs.
- Log in with an API Key (much easier if you use SSO)
API_KEY=$(cat cloudpak.json | jq -r .apiKey)
ibmcloud login --apikey ${API_KEY} -r us-east
- Create a Cloud Object Storage Instance, if it does not exist.
ibmcloud resource service-instance-create \
my-bulk-data \
cloud-object-storage standard global
CRN=$(ibmcloud resource service-instance \
my-bulk-data --output JSON | jq -r '.[].crn')
ibmcloud cos config crn --crn "${CRN}"
ibmcloud cos create-bucket --bucket \
"fhir-bulk-data"
ibmcloud resource service-key-create \
test-user-hmac Writer --instance-id "${CRN}" \
--parameters '{"HMAC":true}' --output JSON
- You’ll see JSON output containing cos_hmac_keys. Save this for later.
"cos_hmac_keys": {
"access_key_id": "abcdefgh",
"secret_access_key": "xyzmnopq"
}
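If you captured the JSON output to a file or variable, a small Python helper can pull the HMAC pair out for later use. This is only a sketch: the helper names are hypothetical, and because the exact nesting of the service key output is not shown above, the code searches the whole document for a `cos_hmac_keys` object rather than assuming a fixed path.

```python
import json


def find_key(node, wanted):
    """Depth-first search for the first value stored under the key `wanted`."""
    if isinstance(node, dict):
        if wanted in node:
            return node[wanted]
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_key(child, wanted)
        if found is not None:
            return found
    return None


def extract_hmac_keys(service_key_json):
    """Return (access_key_id, secret_access_key) from service key JSON text."""
    keys = find_key(json.loads(service_key_json), "cos_hmac_keys")
    if keys is None:
        raise KeyError("cos_hmac_keys not found; was the key created with HMAC enabled?")
    return keys["access_key_id"], keys["secret_access_key"]


# Hypothetical wrapper object; only the cos_hmac_keys part mirrors the output above.
example_output = json.dumps(
    {"credentials": {"cos_hmac_keys": {"access_key_id": "abcdefgh", "secret_access_key": "xyzmnopq"}}}
)
access_key, secret_key = extract_hmac_keys(example_output)
print(access_key)
```

Treat the secret key like a password: avoid printing it or committing it to source control.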
The details of the environment can be output:
ibmcloud resource service-instance \
my-bulk-data --output JSON
- Check the endpoints
curl https://control.cloud-object-storage.cloud.ibm.com/v2/endpoints -o endpoints.json
- In endpoints.json, find the internalUrl (the private endpoint) and the externalUrl (the direct endpoint) for the location of your Cloud Object Storage, and record them along with the region. Note, I used a regional COS instance.
...
"regional": {
"us-south": {
"public": {
"us-south": "s3.us-south.cloud-object-storage.appdomain.cloud"
},
"private": {
"us-south": "s3.private.us-south.cloud-object-storage.appdomain.cloud"
},
"direct": {
"us-south": "s3.direct.us-south.cloud-object-storage.appdomain.cloud"
}
}
...
- Here is a table for your reference:
Name | Value
bucketname | fhir-bulk-data
accessKey | abcdefgh
secretKey | xyzmnopq
region | us-south
internalUrl | s3.private.us-south.cloud-object-storage.appdomain.cloud
externalUrl | s3.direct.us-south.cloud-object-storage.appdomain.cloud
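The endpoint lookup can also be scripted. The sketch below assumes you have already loaded endpoints.json and navigated to the "regional" object shown in the excerpt above (the surrounding structure is elided there, so it is not assumed here). The function name `regional_endpoints` is hypothetical.

```python
def regional_endpoints(regional, region):
    """Map a region's private and direct hosts to the names used in the table."""
    entry = regional[region]
    return {
        "region": region,
        "internalUrl": entry["private"][region],  # the private endpoint
        "externalUrl": entry["direct"][region],   # the direct endpoint
    }


# The "regional" fragment copied from the excerpt above.
regional = {
    "us-south": {
        "public": {"us-south": "s3.us-south.cloud-object-storage.appdomain.cloud"},
        "private": {"us-south": "s3.private.us-south.cloud-object-storage.appdomain.cloud"},
        "direct": {"us-south": "s3.direct.us-south.cloud-object-storage.appdomain.cloud"},
    }
}

for name, value in regional_endpoints(regional, "us-south").items():
    print(name, value)
```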
3. Create the Cluster
VPC_ID=$(ibmcloud ks vpcs --provider vpc-gen2 --output json \
| jq -r '.[].id')
SUBNET_ID=$(ibmcloud ks subnets --provider vpc-gen2 \
--vpc-id ${VPC_ID} --zone us-east-1 --output json \
| jq -r '.[].id')
ibmcloud oc cluster create vpc-gen2 \
--name demo --flavor bx2.4x16 \
--version 1.23.3 \
--cos-instance ${CRN} \
--service-subnet 172.21.0.0/16 --pod-subnet 172.17.64.0/18 \
--workers 3 --zone us-east-1 --vpc-id=${VPC_ID} \
--subnet-id ${SUBNET_ID}
The IBM Cloud Kubernetes Service is comprehensively documented in the IBM Cloud docs.
If you have questions about which version to use, you can refer to ibmcloud ks versions or the docs.
Once your cluster is up and operational, and you can log in to the Administration console, you are ready to target your deployment to the cluster.
4. Build and Push the latest from IBM FHIR Server main
Since main contains features that affect Bulk Data support, it’s best to build the latest code, push the image to a container registry, and pull it into your environment.
- Clone the IBM FHIR Server repository and switch to the cloned repository.
git clone https://github.com/IBM/FHIR.git && cd $(basename $_ .git)
- Set up the examples
mvn clean install -f fhir-examples -DskipTests
- Build the fhir projects
mvn clean install -f fhir-parent -DskipTests
- Build the IBM FHIR Server
export BUILD_ID=4.11.0-SNAPSHOT
nerdctl build fhir-install -t prb112/ibm-fhir-server:latest
nerdctl login docker.io
nerdctl push docker.io/prb112/ibm-fhir-server:latest
Now the latest IBM FHIR Server image is in a public registry. Note, you can always switch to a private registry using a custom pull secret.
5. Use Helm to deploy the IBM FHIR Server Helm chart for Smart-on-FHIR access
This helm chart is very comprehensive: it includes Postgres as a subchart, and Keycloak with its own Postgres.
- Add the Helm Chart
helm repo add alvearie https://alvearie.io/alvearie-helm
- Update the Helm Chart
$ helm repo update alvearie
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "alvearie" chart repository
Update Complete. ⎈Happy Helming!⎈
- Create a Postgres Password, and save this locally.
export POSTGRES_PASSWORD=$(openssl rand -hex 20)
echo $POSTGRES_PASSWORD
- Configure your kubectl for the target cluster
ibmcloud ks cluster config --cluster demo
You see:
OK
The configuration for demo was downloaded successfully.
Added context for m to the current kubeconfig file.
You can now execute 'kubectl' commands against your cluster. For example, run 'kubectl get nodes'.
If you are accessing the cluster for the first time, 'kubectl' commands might fail for a few seconds while RBAC synchronizes.
- Create a namespace for the target deployment.
kubectl create namespace example
namespace/example created
- Set up the TLS secret for the IBM-provided domain (see the IBM Cloud docs)
ibmcloud ks cluster get --cluster demo --output JSON | jq .ingress
The output is
{
"hostname": "demo-12345-0000.us-east.containers.appdomain.cloud",
"secretName": "demo-1235-0000",
"status": "healthy",
"message": "All Ingress components are healthy"
}
Copy the secret from the default namespace to the new namespace example
kubectl get secret -n default demo-1235-0000 -o yaml | sed 's/namespace: .*/namespace: example/' | kubectl apply -n example -f -
secret/demo-1235-0000 created
Save the hostname, and secretName for later.
- Set up the secret for the IBM Cloud container registry (see the IBM Cloud docs)
kubectl get secret -n default all-icr-io -o yaml | sed 's/namespace: .*/namespace: example/' | kubectl apply -n example -f -
secret/all-icr-io created
Note, this pull secret is provided by IBM Cloud in the default namespace of the cluster.
- Create an overrides file, values-example.yml, and update the following values:
Value to Update | Value to replace below | Notes
image.repository | REPLACE_WITH_YOUR_REPO | The location of the repository / image: docker.io/prb112/ibm-fhir-server or the recommended default for this case, quay.io/alvearie/fhir-data-access
postgresql.postgresqlPassword | REPLACE_WITH_YOUR_POSTGRES_PASSWORD | The Postgres Password you generated
keycloak.postgresql.postgresqlPassword | REPLACE_WITH_YOUR_POSTGRES_PASSWORD | The Postgres Password you generated
keycloak.adminPassword | REPLACE_WITH_YOUR_ADMIN_PASSWORD | You can pick this password
objectStorage.location | REPLACE_WITH_COS_REGION | This is the COS Bucket’s region
objectStorage.endpointUrl | REPLACE_WITH_COS_ENDPOINT_URL | This is the COS endpointURL (the direct one) and is prefixed with https
objectStorage.accessKey | REPLACE_WITH_ACCESS_KEY | The COS HMAC accessKey you created
objectStorage.secretKey | REPLACE_WITH_SECRET_KEY | The COS HMAC secretKey you created
objectStorage.bulkDataBucketName | REPLACE_WITH_BUCKET_NAME | The Bucket you previously created
ingress.secretName | REPLACE_SECRET_NAME_FOR_TLS | The secret name for TLS, demo-1235-0000 from above
ingress.hostname | REPLACE_WITH_INGRESS_HOSTNAME | The hostname recorded from above, demo-12345-0000.us-east.containers.appdomain.cloud
image:
  repository: REPLACE_WITH_YOUR_REPO
  tag: latest
  pullPolicy: Always
ingress:
  hostname: "{{ $.Release.Namespace }}.REPLACE_WITH_INGRESS_HOSTNAME"
  tls:
    - secretName: REPLACE_SECRET_NAME_FOR_TLS
  annotations:
    nginx.ingress.kubernetes.io/backend-protocol: HTTPS
traceSpec: >-
  com.ibm.fhir.smart.*=fine:com.ibm.fhir.server.*=fine
postgresql:
  enabled: true
  postgresqlPassword: REPLACE_WITH_YOUR_POSTGRES_PASSWORD
  nameOverride: postgres
security:
  jwtValidation:
    enabled: true
  oauth:
    enabled: true
    regUrl: "https://{{ tpl $.Values.ingress.hostname $ }}/auth/realms/test/clients-registrations/openid-connect"
    authUrl: "https://{{ tpl $.Values.ingress.hostname $ }}/auth/realms/test/protocol/openid-connect/auth"
    tokenUrl: "https://{{ tpl $.Values.ingress.hostname $ }}/auth/realms/test/protocol/openid-connect/token"
    smart:
      enabled: true
      resourceScopes:
        - "patient/*.read"
        - "patient/AllergyIntolerance.read"
        - "patient/CarePlan.read"
        - "patient/CareTeam.read"
        - "patient/Condition.read"
        - "patient/Device.read"
        - "patient/DiagnosticReport.read"
        - "patient/DocumentReference.read"
        - "patient/Encounter.read"
        - "patient/ExplanationOfBenefit.read"
        - "patient/Goal.read"
        - "patient/Immunization.read"
        - "patient/Location.read"
        - "patient/Medication.read"
        - "patient/MedicationRequest.read"
        - "patient/MedicationDispense.read"
        - "patient/Observation.read"
        - "patient/Organization.read"
        - "patient/Patient.read"
        - "patient/Practitioner.read"
        - "patient/PractitionerRole.read"
        - "patient/Procedure.read"
        - "patient/Provenance.read"
        - "patient/RelatedPerson.read"
        - "system/*.read"
        - "system/AllergyIntolerance.read"
        - "system/CarePlan.read"
        - "system/CareTeam.read"
        - "system/Condition.read"
        - "system/Device.read"
        - "system/DiagnosticReport.read"
        - "system/DocumentReference.read"
        - "system/Encounter.read"
        - "system/ExplanationOfBenefit.read"
        - "system/Goal.read"
        - "system/Immunization.read"
        - "system/Location.read"
        - "system/Medication.read"
        - "system/MedicationRequest.read"
        - "system/MedicationDispense.read"
        - "system/Observation.read"
        - "system/Organization.read"
        - "system/Patient.read"
        - "system/Practitioner.read"
        - "system/PractitionerRole.read"
        - "system/Procedure.read"
        - "system/Provenance.read"
        - "system/RelatedPerson.read"
keycloak:
  enabled: true
  adminUsername: admin
  adminPassword: REPLACE_WITH_YOUR_ADMIN_PASSWORD
  config:
    enabled: true
    realms:
      test:
        clients:
          inferno:
            consentRequired: true
            publicClient: true
            redirectURIs:
              - "http://localhost:4567/inferno/*"
            defaultScopes: []
            optionalScopes:
              - "patient/*.read"
              - "patient/AllergyIntolerance.read"
              - "patient/CarePlan.read"
              - "patient/CareTeam.read"
              - "patient/Condition.read"
              - "patient/Device.read"
              - "patient/DiagnosticReport.read"
              - "patient/DocumentReference.read"
              - "patient/Encounter.read"
              - "patient/ExplanationOfBenefit.read"
              - "patient/Goal.read"
              - "patient/Immunization.read"
              - "patient/Location.read"
              - "patient/Medication.read"
              - "patient/MedicationRequest.read"
              - "patient/MedicationDispense.read"
              - "patient/Observation.read"
              - "patient/Organization.read"
              - "patient/Patient.read"
              - "patient/Practitioner.read"
              - "patient/PractitionerRole.read"
              - "patient/Procedure.read"
              - "patient/Provenance.read"
              - "patient/RelatedPerson.read"
          infernoBulk:
            consentRequired: false
            publicClient: false
            standardFlowEnabled: false
            serviceAccountsEnabled: true
            clientAuthenticatorType: client-jwt
            defaultScopes: []
            optionalScopes:
              - "system/*.read"
              - "system/AllergyIntolerance.read"
              - "system/CarePlan.read"
              - "system/CareTeam.read"
              - "system/Condition.read"
              - "system/Device.read"
              - "system/DiagnosticReport.read"
              - "system/DocumentReference.read"
              - "system/Encounter.read"
              - "system/ExplanationOfBenefit.read"
              - "system/Goal.read"
              - "system/Immunization.read"
              - "system/Location.read"
              - "system/Medication.read"
              - "system/MedicationDispense.read"
              - "system/MedicationRequest.read"
              - "system/Observation.read"
              - "system/Organization.read"
              - "system/Patient.read"
              - "system/Practitioner.read"
              - "system/PractitionerRole.read"
              - "system/Procedure.read"
              - "system/Provenance.read"
              - "system/RelatedPerson.read"
  ingress:
    enabled: true
    rules:
      - host: "{{ $.Release.Namespace }}.REPLACE_WITH_INGRESS_HOSTNAME"
        paths:
          - path: /auth
            pathType: Prefix
    servicePort: https
    tls:
      - secretName: REPLACE_SECRET_NAME_FOR_TLS
    annotations:
      nginx.ingress.kubernetes.io/server-snippet: |
        add_header Strict-Transport-Security "max-age=86400; includeSubDomains";
      nginx.ingress.kubernetes.io/backend-protocol: HTTPS
      nginx.ingress.kubernetes.io/proxy-buffer-size: "64k"
      nginx.ingress.kubernetes.io/proxy-ssl-protocols: TLSv1.2 TLSv1.3
  postgresql:
    postgresqlPassword: REPLACE_WITH_YOUR_POSTGRES_PASSWORD
objectStorage:
  enabled: true
  location: REPLACE_WITH_COS_REGION
  endpointUrl: https://REPLACE_WITH_COS_ENDPOINT_URL
  accessKey: REPLACE_WITH_ACCESS_KEY
  secretKey: REPLACE_WITH_SECRET_KEY
  bulkDataBucketName: REPLACE_WITH_BUCKET_NAME
  batchIdEncryptionKey:
The above configuration enables READ only system scopes.
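With this many placeholders it is easy to miss one. The following Python sketch fills the REPLACE_ tokens from a dictionary and reports any that remain. The function names are hypothetical, and the approach assumes every placeholder follows the REPLACE_ naming pattern used in the overrides file.

```python
import re

# Matches tokens such as REPLACE_WITH_COS_REGION or REPLACE_SECRET_NAME_FOR_TLS.
PLACEHOLDER = re.compile(r"REPLACE_[A-Z0-9_]+")


def fill_placeholders(template, values):
    """Replace each known placeholder token; leave unknown tokens untouched."""
    def substitute(match):
        token = match.group(0)
        return values.get(token, token)
    return PLACEHOLDER.sub(substitute, template)


def unfilled_placeholders(text):
    """Return the sorted, de-duplicated placeholder tokens still present."""
    return sorted(set(PLACEHOLDER.findall(text)))


template = (
    "objectStorage:\n"
    "  location: REPLACE_WITH_COS_REGION\n"
    "  endpointUrl: https://REPLACE_WITH_COS_ENDPOINT_URL\n"
    "  secretKey: REPLACE_WITH_SECRET_KEY\n"
)
filled = fill_placeholders(
    template,
    {
        "REPLACE_WITH_COS_REGION": "us-south",
        "REPLACE_WITH_COS_ENDPOINT_URL": "s3.direct.us-south.cloud-object-storage.appdomain.cloud",
    },
)
print(filled)
print("still to fill:", unfilled_placeholders(filled))
```

Matching whole tokens, rather than calling a plain string replace per key, avoids accidentally rewriting a placeholder whose name happens to begin with another placeholder's name.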
- Upgrade and install
helm upgrade --install ibm-fhir-server alvearie/ibm-fhir-server -f values-example.yml --namespace=example
Note, helm outputs the fhiruser password and ingress.hostname; save these for later.
- Watch the pods until they are up in the Running state.
kubectl -n example get pods -w
NAME READY STATUS RESTARTS AGE
ibm-fhir-server-7557689c57-mq7zr 0/1 Init:0/1 0 53s
ibm-fhir-server-7557689c57-tfjcl 0/1 Init:0/1 0 54s
ibm-fhir-server-postgres-0 0/1 Pending 0 54s
ibm-fhir-server-schematool-g2wq5 0/1 Init:0/1 0 54s
The output then progresses as follows. Wait until the ibm-fhir-server pods are Running.
ibm-fhir-server-7557689c57-mq7zr 0/1 Init:0/1 0 53s
ibm-fhir-server-7557689c57-tfjcl 0/1 Init:0/1 0 54s
ibm-fhir-server-postgres-0 0/1 Pending 0 54s
ibm-fhir-server-schematool-g2wq5 0/1 Init:0/1 0 54s
ibm-fhir-server-postgres-0 0/1 Pending 0 73s
ibm-fhir-server-postgres-0 0/1 ContainerCreating 0 73s
ibm-fhir-server-postgres-0 0/1 ContainerCreating 0 2m19s
ibm-fhir-server-postgres-0 0/1 Running 0 2m20s
ibm-fhir-server-postgres-0 1/1 Running 0 2m33s
ibm-fhir-server-7557689c57-mq7zr 0/1 PodInitializing 0 2m42s
ibm-fhir-server-7557689c57-mq7zr 0/1 Running 0 2m43s
ibm-fhir-server-7557689c57-tfjcl 0/1 PodInitializing 0 2m44s
ibm-fhir-server-schematool-g2wq5 0/1 PodInitializing 0 2m44s
ibm-fhir-server-7557689c57-tfjcl 0/1 Running 0 2m45s
ibm-fhir-server-schematool-g2wq5 1/1 Running 0 2m45s
ibm-fhir-server-schematool-g2wq5 0/1 Completed 0 3m48s
ibm-fhir-server-schematool-g2wq5 0/1 Completed 0 3m49s
ibm-fhir-server-7557689c57-mq7zr 1/1 Running 0 3m50s
ibm-fhir-server-7557689c57-tfjcl 1/1 Running 0 3m51s
- Check $healthcheck
curl -i -u 'fhiruser:REPLACE_WITH_PASSWORD' 'https://REPLACE_WITH_BASE_URL.containers.appdomain.cloud/fhir-server/api/v4/$healthcheck' -v
< HTTP/2 200
HTTP/2 200
< date: Wed, 19 Jan 2022 16:25:27 GMT
date: Wed, 19 Jan 2022 16:25:27 GMT
< content-length: 0
content-length: 0
< content-language: en-US
content-language: en-US
< strict-transport-security: max-age=15724800; includeSubDomains
strict-transport-security: max-age=15724800; includeSubDomains
6. Login to Keycloak
Keycloak provides the authentication and authorization service for IBM FHIR Server’s implementation of Smart-on-FHIR.
- Sign in to the Keycloak Console at https://REPLACE_WITH_BASE_URL/auth/ using keycloak.adminUsername as the user and keycloak.adminPassword for the password.
- You are in the Test realm. Click Clients > infernoBulk.
- Select Use JWKS and enter https://bulk-data.smarthealthit.org/keys/RS384.public.json. Note, this key is only for testing.
{
"keys": [
{
"kty": "RSA",
"alg": "RS384",
"n": "<<REDACTED>>",
"e": "AQAB",
"key_ops": [
"verify"
],
"use": "sig",
"ext": true,
"kid": "6cf70879258f9c656bb7ccc65802d099"
}
]
}
- Click Import
- Click Client Scopes. Under Optional Client Scopes, select any that start with system/ and click Add selected.
system/*.read
system/AllergyIntolerance.read
system/CarePlan.read
system/CareTeam.read
system/Condition.read
system/Device.read
system/DiagnosticReport.read
system/DocumentReference.read
system/Encounter.read
system/ExplanationOfBenefit.read
system/Goal.read
system/Immunization.read
system/Location.read
system/Medication.read
system/MedicationDispense.read
system/MedicationRequest.read
system/Observation.read
system/Organization.read
system/Patient.read
system/Practitioner.read
system/PractitionerRole.read
system/Procedure.read
system/Provenance.read
system/RelatedPerson.read
- Click Service Account. If this is blank, it should prompt you to create the Service Account user.
- For Service-account-infernobulk, click Groups
- Search the available groups for /fhirUser and add /fhirUser to the Group Membership
You now have a Service Account for SMART Backend Services Authorization for BulkData usage.
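To make the flow concrete, here is a hedged Python sketch of what a client sends to the token endpoint under SMART Backend Services: a short-lived JWT assertion whose iss and sub are the client id and whose aud is the token URL, posted with the client_credentials grant. The function names are hypothetical. Signing the claims with the RS384 private key that matches the registered JWKS is deliberately left out, since it needs a JWT library.

```python
import time
import uuid

ASSERTION_TYPE = "urn:ietf:params:oauth:client-assertion-type:jwt-bearer"


def build_assertion_claims(client_id, token_url, now=None, lifetime_seconds=300):
    """Build the JWT claims for the client assertion (to be signed separately)."""
    issued = int(time.time()) if now is None else int(now)
    return {
        "iss": client_id,                  # issuer is the client itself
        "sub": client_id,                  # subject is also the client
        "aud": token_url,                  # audience is the token endpoint
        "exp": issued + lifetime_seconds,  # keep within five minutes
        "jti": uuid.uuid4().hex,           # unique id to prevent replay
    }


def build_token_request(signed_jwt, scopes):
    """Build the form fields posted to the token endpoint."""
    return {
        "grant_type": "client_credentials",
        "scope": " ".join(scopes),
        "client_assertion_type": ASSERTION_TYPE,
        "client_assertion": signed_jwt,
    }


# The token URL follows the tokenUrl pattern from the overrides file.
token_url = "https://REPLACE_WITH_BASE_URL/auth/realms/test/protocol/openid-connect/token"
claims = build_assertion_claims("infernoBulk", token_url)
# "header.payload.signature" stands in for the JWT produced by signing `claims`.
form = build_token_request("header.payload.signature", ["system/*.read"])
print(sorted(form))
```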
7. Side Loading Data
To sideload data, you can start a new container from the ibmcom/ibm-fhir-server image with a custom datasource and fhir-server-config.json, on a machine that has kubectl and the ibmcloud tools installed.
- Start up the container
nerdctl run -p 9443:9443 --name fhir -e BOOTSTRAP_DB=true ibmcom/ibm-fhir-server
docker.io/ibmcom/ibm-fhir-server:latest
- You then port-forward to the Kubernetes cluster’s postgres from the container
kubectl port-forward --namespace=example service/ibm-fhir-server-postgres 5432:5432
<server>
<!-- ============================================================== -->
<!-- TENANT: default; DSID: default; TYPE: read-write -->
<!-- ============================================================== -->
<dataSource id="fhirDefaultDefault" jndiName="jdbc/fhir_default_default" type="javax.sql.XADataSource" statementCacheSize="200" syncQueryTimeoutWithTransactionTimeout="true" validationTimeout="30s">
<jdbcDriver javax.sql.XADataSource="org.postgresql.xa.PGXADataSource" libraryRef="sharedLibPostgres"/>
<properties.postgresql
serverName="localhost"
portNumber="5432"
databaseName="fhir"
user="postgres"
password="REPLACE_WITH_YOUR_POSTGRES_PASSWORD"
currentSchema="fhirdata"
/>
<connectionManager maxPoolSize="200" minPoolSize="40"/>
</dataSource>
</server>
- Download the Patient bundle
curl -L https://raw.githubusercontent.com/IBM/FHIR/main/fhir-server-test/src/test/resources/testdata/everything-operation/Antonia30_Acosta403.json -o Antonia30_Acosta403.json
- Check the Patient
curl -u 'fhiruser:change-password' 'https://localhost:9443/fhir-server/api/v4/Patient?_format=application/json&_page=1&_sort=-_lastUpdated'
The response should contain a single Patient, which shows that the patient is loaded and the environment is ready for more comprehensive testing.
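If you want to check the response programmatically instead of by eye, a sketch like the following counts the Patient entries in the returned search Bundle. The function name is hypothetical, and it only counts the entries on the page it is given, not across pages.

```python
import json


def count_patients(search_response):
    """Count the Patient resources among the entries of a FHIR search Bundle."""
    bundle = json.loads(search_response)
    if bundle.get("resourceType") != "Bundle":
        raise ValueError("expected a FHIR Bundle")
    entries = bundle.get("entry", [])
    return sum(
        1 for entry in entries
        if entry.get("resource", {}).get("resourceType") == "Patient"
    )


# A trimmed, hypothetical response body for illustration only.
response_body = json.dumps(
    {
        "resourceType": "Bundle",
        "type": "searchset",
        "entry": [{"resource": {"resourceType": "Patient", "id": "example"}}],
    }
)
print(count_patients(response_body))
```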
A follow-on test uses the RS384 key from SMART Health IT and the bulk data client to exercise the environment.
Summary
You have learned more about the Connectathon and about SMART Backend Services Authorization using the SMART Health IT test key.
Further information on testing is available at https://bastide.org/2022/01/14/bulk-data-using-the-smart-on-fhir-bulk-data-client-to-test-export/
Trackers/Issues
A lot of interesting points were raised at the Connectathon, and the IBM Team identified a number of issues:
- AccessTokens should not be set with Presigned URLs #3188
- Support BulkData with Expires Header #3185
- Scope warning message for $export is confusing #3182
- $import allows adding Resources of multiple types in the same ndjson which could include unsupported resources. #3180
- fhir-smart Patient/$export assumes no _type filtering leading #3179
- Support subsetting exported resources based on implied SMART-on-FHIR scopes #3177
- Support associating a serviceAccount user with a particular group #33
And a few which we opened with the bulk data client team:
- Bearer token is expected to be capitalized Bearer. #1
- User-Agent string is awkward #3
- Output doesn’t give a lot of details on what resourceType was exported #5
And one we’re watching:
- Provides token even if requiresAccessToken is false #2
And a few which we’ve had on the plan for a while:
- BulkData 2.0.0: _type query parameter’s cardinality is relaxed #3081
- Bulk Data Export 2.0.0: Support the bulkdata patient parameter #1719
I’m looking forward to the next Connectathon and working with you all.
Links
These links are handy for anyone starting out: