Fiber Data Hub: Tractography, Diffusion MRI, and Structural Images for Brain Research

Reference: pdf

Yeh, Fang-Cheng. “DSI Studio: An Integrated Tractography Platform and Fiber Data Hub for Accelerating Brain Research.” Nature Methods, July 2025, https://doi.org/10.1038/s41592-025-02762-8. 

The Fiber Data Hub is a cloud-based platform designed to openly distribute processed fiber data derived from diffusion MRI, enabling scalable and reproducible research in brain connectivity. Currently, the Hub hosts over 40,000 processed fiber datasets, providing comprehensive fiber information such as fiber orientation, anisotropy, diffusivities, and advanced diffusion metrics. These datasets originate from major neuroimaging initiatives, including the Human Connectome Project (HCP), the Adolescent Brain Cognitive Development (ABCD) Study, OpenNeuro, and the International Neuroimaging Data-sharing Initiative (INDI). By offering standardized and compact fiber data, the Hub significantly reduces computational demands, simplifies analytical workflows, and avoids redundant preprocessing. Users can explore data through an intuitive web interface featuring advanced metadata-driven search and built-in quality control measures, promoting collaboration, ensuring consistency, and supporting reproducible neuroscience research.

By consolidating curated and preprocessed fiber datasets from prominent research studies, the Fiber Data Hub enables researchers worldwide to explore brain connectivity without the need for resource-intensive data preparation. Whether studying neurodevelopment, neurological disorders, or population-level brain structure, the Fiber Data Hub offers an invaluable foundation for accelerating discoveries in neuroscience.

📁 File Formats

The Fiber Data Hub provides compact file formats to reduce storage size. These files can be converted back to standard NIfTI format using DSI Studio or Python scripts.

🧠 Fib Files (`*.fz`)

Stores voxel-wise fiber orientation (i.e., fixels), DTI metrics (e.g., anisotropy, diffusivity), and GQI metrics (e.g., QA, ISO).
gqi.fz: data in native space
qsdr.fz: data in MNI space

📦 SRC Files (`*.sz`)

Stores 4D DWI volumes (after eddy/topup correction)
Includes the b-table

📊 Database Files (`*.dz`)

Stores group-level DTI/GQI metrics across subjects in MNI space
Uses the HCP-1065 fiber orientation template for alignment

✅ QC Files (`qc.tsv`)

Tab-separated file with quality control metrics derived from .sz files
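As a quick way to sort downloaded files by type, the suffixes listed above can be mapped to their descriptions. This is a minimal sketch: the `FORMATS` table and the `describe` function are hypothetical helpers, not part of DSI Studio, and the matching is based only on the suffixes named on this page.

```python
from pathlib import PurePosixPath

# Suffix -> description, taken from the format list above.
# Order matters: ".gqi.fz" and ".qsdr.fz" must be tested before the generic ".fz".
FORMATS = [
    ("qc.tsv", "QC table (tab-separated quality control metrics)"),
    (".gqi.fz", "Fib file, native space"),
    (".qsdr.fz", "Fib file, MNI space"),
    (".fz", "Fib file"),
    (".sz", "SRC file (4D DWI volumes and b-table)"),
    (".dz", "Database file (group-level metrics in MNI space)"),
]


def describe(filename: str) -> str:
    """Return a short description of a Fiber Data Hub file based on its suffix."""
    name = PurePosixPath(filename).name.lower()
    for suffix, label in FORMATS:
        if name.endswith(suffix):
            return label
    return "unknown"


for example in ["subject.gqi.fz", "subject.qsdr.fz", "subject.sz", "qc.tsv"]:
    print(f"{example}: {describe(example)}")
```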

Example command lines to convert files into NIfTI format

Export diffusion metrics as NIfTI files from Fib files (the first command converts one file; the second uses a wildcard to process every matching file)

dsi_studio --action=exp --source=subject.gqi.fz --export=dti_fa,md,rd,qa,iso
dsi_studio --action=exp --source=*.gqi.fz --export=dti_fa,md,rd,qa,iso

Convert SRC files to 4D NIfTI files with bval/bvec

dsi_studio --action=rec --source=subject.sz --save_nii=subject_dwi.nii.gz
dsi_studio --action=rec --source=*.sz --save_nii=subject_dwi.nii.gz
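For batch conversion from Python, the commands above can be wrapped with `subprocess`. This is a minimal sketch under stated assumptions: it uses only the flags shown above, assumes `dsi_studio` is on the PATH, and derives one output name per SRC file (input name plus `_dwi.nii.gz`), which is an illustrative choice rather than a DSI Studio convention. The function names are hypothetical.

```python
import subprocess
from pathlib import Path


def export_cmd(fib_file, metrics=("dti_fa", "md", "rd", "qa", "iso")):
    """Build the command that exports metrics from one Fib file."""
    return ["dsi_studio", "--action=exp", f"--source={fib_file}",
            "--export=" + ",".join(metrics)]


def src_to_nifti_cmd(src_file):
    """Build the command that converts one SRC file to a 4D NIfTI file."""
    src = Path(src_file)
    # Assumption: name the output after the input, e.g. subject.sz -> subject_dwi.nii.gz
    stem = src.name[:-3] if src.name.endswith(".sz") else src.name
    out = src.with_name(f"{stem}_dwi.nii.gz")
    return ["dsi_studio", "--action=rec", f"--source={src}", f"--save_nii={out}"]


def run_all(folder=".", dry_run=True):
    """Print (and optionally run) one command per .gqi.fz and .sz file in a folder."""
    cmds = [export_cmd(p) for p in sorted(Path(folder).glob("*.gqi.fz"))]
    cmds += [src_to_nifti_cmd(p) for p in sorted(Path(folder).glob("*.sz"))]
    for cmd in cmds:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # raises if dsi_studio exits with an error
    return cmds
```

Calling `run_all(".")` only prints the commands; pass `dry_run=False` to execute them.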

Repository List

If you would like to suggest a dataset, please feel free to reach out to me (frank.yeh@gmail.com). We will preprocess the data and distribute it.

To access the following restricted data (.sz, T1w, etc.), please email me your signed NDA data use agreement. Once I receive it, I will add you to the user list so that you can access the data.

How to Request Restricted Access

Update DSI Studio to a version released after June 2025.
Launch DSI Studio. When you see the login page, right-click on the [Registry Entity], select [Select All], and then choose [Copy].
Email your registry entity to frank.yeh@gmail.com and attach your signed NDA agreement.
We will set up your access and inform you. Once you restart DSI Studio, you will have access to the restricted folder.
If your registry entity changes, please notify me so that I can update it on the data server.

Download Command

Example: download [OWNER]/[REPO]/[TAG] = data-hcp/lifespan/hcp-ya-retest (requires curl and jq)

curl -s https://api.github.com/repos/data-hcp/lifespan/releases/tags/hcp-ya-retest | jq -r '.assets[].browser_download_url' | xargs -n1 -P4 curl -LO
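The same lookup can be sketched in Python without jq. The URL pattern is the one used in the curl command above; `release_api_url` and `asset_urls` are hypothetical helper names, and the sample release record below is made up purely to show the JSON shape that the `jq` filter reads.

```python
def release_api_url(spec: str) -> str:
    """Turn 'OWNER/REPO/TAG' into the GitHub release API URL used above."""
    owner, repo, tag = spec.strip("/").split("/")
    return f"https://api.github.com/repos/{owner}/{repo}/releases/tags/{tag}"


def asset_urls(release: dict) -> list:
    """Collect the download URL of every asset in a release record."""
    return [asset["browser_download_url"] for asset in release.get("assets", [])]


print(release_api_url("data-hcp/lifespan/hcp-ya-retest"))

# Hypothetical release record, for illustration only
sample_release = {
    "assets": [
        {"name": "sub-01.gqi.fz", "browser_download_url": "https://example.org/sub-01.gqi.fz"},
        {"name": "sub-02.gqi.fz", "browser_download_url": "https://example.org/sub-02.gqi.fz"},
    ]
}
print(asset_urls(sample_release))
```

In practice the release record would come from a GET request to the URL returned by `release_api_url`.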

Example Code

List all data repositories (owner/repo/tag/)

import requests

owners = ["data-hcp", "data-nih", "data-openneuro", "data-indi", "data-others"]

def get_json(url):
    # Fetch one page of up to 100 items; raise on HTTP errors such as rate limiting
    r = requests.get(url, params={"per_page": 100}, timeout=30)
    r.raise_for_status()
    return r.json()

def ft(owner, indent=""):
    # Print every repository of an owner, followed by the tags in each repository
    print(f"{indent}{owner}/")
    for repo in get_json(f"https://api.github.com/users/{owner}/repos"):
        print(f"{indent}  {repo['name']}/")
        for tag in get_json(f"https://api.github.com/repos/{owner}/{repo['name']}/tags"):
            print(f"{indent}    {tag['name']}")

if __name__ == "__main__":
    for owner in owners:
        ft(owner)

Download fib files from repository data-hcp/lifespan/hcp-ya

import os
import re
import requests

owner, repo, tag = "data-hcp", "lifespan", "hcp-ya"  # repository details
pattern = r".*\.fz$"  # select all .fz files
api_url = f"https://api.github.com/repos/{owner}/{repo}/releases/tags/{tag}"  # API endpoint

def dl(url, path):  # stream one asset to disk, printing progress when the size is known
    try:
        with requests.get(url, stream=True, timeout=60) as r:
            r.raise_for_status()
            size = int(r.headers.get("content-length", 0))
            done = 0
            with open(path, "wb") as f:
                for chunk in r.iter_content(1024 * 1024):
                    f.write(chunk)
                    done += len(chunk)
                    if size:  # avoid dividing by zero when the size is not reported
                        print(f"\rDownloaded {os.path.basename(path)}: {done / size * 100:.2f}%", end="")
        print("\nDone")
    except Exception as e:
        print(f"Error: {e}")

try:
    resp = requests.get(api_url, timeout=30)
    resp.raise_for_status()
    for asset in resp.json().get("assets", []):
        if re.match(pattern, asset["name"]):
            dl(asset["browser_download_url"], os.path.join(os.getcwd(), asset["name"]))
except Exception as e:
    print(f"Error fetching/processing assets: {e}")

Search data using the QC report

import io
import requests
import pandas as pd

owner, repo, scan_name = "data-hcp", "lifespan", ""  # leave scan_name empty to keep all rows
qc_url = "https://api.github.com/repos/frankyeh/FiberDataHub/releases/tags/qc-data"
try:
    assets = requests.get(qc_url, timeout=30).json().get("assets", [])
except Exception as e:
    print(f"Err fetching assets: {e}")
    assets = []

all_data = []
for a in assets:
    name = a["name"]
    if name.startswith(owner) and (not repo or repo in name) and name.endswith(".tsv"):
        resp = requests.get(a["browser_download_url"], timeout=60)
        resp.raise_for_status()  # stops if a download fails
        df = pd.read_csv(io.StringIO(resp.text), sep="\t", dtype=str)
        if scan_name:  # match scan_name against the first column, ignoring case
            df = df[df.iloc[:, 0].str.contains(scan_name, case=False, na=False)]
        if not df.empty:
            all_data.append(df)

if all_data:
    pd.concat(all_data, ignore_index=True).to_csv("result_data.tsv", sep="\t", index=False)
    print("Saved result_data.tsv")
else:
    print("No data found.")
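Once a combined QC table has been saved, a common next step is to keep only the rows where one metric falls below a cutoff. The sketch below shows one way to do that with the standard library alone. The column names `scan` and `score`, the sample rows, and the `rows_below` helper are all hypothetical: this page does not list the columns of `qc.tsv`, so substitute the real header names from your own file.

```python
import csv
import io


def rows_below(tsv_text, column, threshold):
    """Return the rows of a tab-separated table whose numeric `column` is below `threshold`."""
    kept = []
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        try:
            value = float(row[column])
        except (TypeError, ValueError):
            continue  # skip missing or non-numeric entries
        if value < threshold:
            kept.append(row)
    return kept


# Made-up table for illustration; real QC files have their own columns
sample = "scan\tscore\nsub-01\t0.91\nsub-02\t0.42\nsub-03\tn/a\n"
flagged = rows_below(sample, "score", 0.5)
print([row["scan"] for row in flagged])
```

To apply it to the saved table, read `result_data.tsv` as text and pass the contents as `tsv_text`.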

The Fiber Data Hub utilizes a versatile storage framework, incorporating multiple decentralized storage locations on GitHub repositories to ensure reliable data access and allow for future expansion. As new studies and datasets become available, the hub’s storage can easily scale to accommodate them, offering an ever-growing resource for the neuroimaging community. Additionally, a centralized web portal at brain.labsolver.org provides alternative access to the hub’s resources, giving researchers flexible options for data retrieval.

Access from DSI Studio

To make data access and analysis as seamless as possible, the Fiber Data Hub is fully integrated with DSI Studio, a comprehensive diffusion MRI and tractography software. Through DSI Studio’s graphical interface, researchers can directly download, inspect, and analyze data from the hub without additional preprocessing, saving time and computational resources. This integration allows researchers to jump-start tractography analyses using advanced tracking methods available in DSI Studio, including deterministic, probabilistic, differential, and correlational tracking.

Population-based dMRI

Tractography Atlases & Connectome

Group-Average Templates

Individual dMRI (HCP)

ABCD (n=~10,000), Adult(n=1065), Aging(n=720), Developmental(n=636), Neonate(n=724), Baby(pending)

Individual dMRI (In-Vivo)

HBN(n=2118), NKI-RK(n=1248+f), Cam-CAN(n=634), SWU-SLIM(n=564+f), QTAB(n=415), NTU(n=90), Test-Retest

Individual dMRI (Ex-Vivo)

Duke_Brainstem, MGH_Single_Subject, Histology Images, PRIME-DE(Nonhuman_Primate)

Individual dMRI (Patient)

UPennGBM(n=531)
