Lightning disentangles PyTorch code to decouple the science from the engineering.
Lightning is designed with these principles in mind:
Principle 1: Enable maximal flexibility.
Principle 2: Abstract away unnecessary boilerplate, but make it accessible when needed.
Principle 3: Systems should be self-contained (i.e., optimizers, computation code, etc.).
Principle 4: Deep learning code should be organized into 4 distinct categories:
- Research code (the LightningModule).
- Engineering code (you delete it, and the Trainer handles it).
- Non-essential research code (logging, etc.; this goes in Callbacks).
- Data (use PyTorch DataLoaders or organize them into a LightningDataModule).
Once you do this, you can train on multiple GPUs, TPUs, CPUs, and even in 16-bit precision without changing your code!
Get started with our 2-step guide
Lightning is also designed for the fast inference AI researchers and production teams need to scale up things like BERT and self-supervised learning. Lightning can automatically export to ONNX or TorchScript for those cases.
* torch>=1.4 is the minimal PyTorch version for Python 3.8.
** Tests run on two NVIDIA K80.
*** Tests run on Google GKE TPUv2/3.
TPU w/ py3.6/py3.7 means we support Colab and Kaggle environments.
Simple installation from PyPI
pip install pytorch-lightning
To get the full package experience, you can also install all optional dependencies with pytorch-lightning['extra'], or for CPU users with pytorch-lightning['cpu-extra'].
From Conda
conda install pytorch-lightning -c conda-forge
Install future release from the source (no guarantees)
pip install git+https://github.com/PytorchLightning/pytorch-lightning.git@release/1.2-dev --upgrade
or nightly from testing PyPI
pip install -U -i https://test.pypi.org/simple/ pytorch-lightning
import os
import torch
from torch import nn
import torch.nn.functional as F
from torchvision.datasets import MNIST
from torch.utils.data import DataLoader, random_split
from torchvision import transforms
import pytorch_lightning as pl
A LightningModule defines a full system (ie: a GAN, autoencoder, BERT or a simple Image Classifier).
class LitAutoEncoder(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(28 * 28, 128), nn.ReLU(), nn.Linear(128, 3))
        self.decoder = nn.Sequential(nn.Linear(3, 128), nn.ReLU(), nn.Linear(128, 28 * 28))

    def forward(self, x):
        # in lightning, forward defines the prediction/inference actions
        embedding = self.encoder(x)
        return embedding

    def training_step(self, batch, batch_idx):
        # training_step defines the train loop. It is independent of forward
        x, y = batch
        x = x.view(x.size(0), -1)
        z = self.encoder(x)
        x_hat = self.decoder(z)
        loss = F.mse_loss(x_hat, x)
        self.log('train_loss', loss)
        return loss

    def configure_optimizers(self):
        optimizer = torch.optim.Adam(self.parameters(), lr=1e-3)
        return optimizer
Note: training_step defines the training loop, while forward defines how the LightningModule behaves during inference/prediction.
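As a rough mental model of that split, here is a toy sketch in plain Python (no PyTorch or Lightning required). ToyModule and ToyTrainer are hypothetical names invented for this illustration, and training_step returns a hand-derived gradient only because there is no autograd here. The real Trainer does far more than this loop.

```python
class ToyModule:
    """Hypothetical stand-in for a LightningModule: fits y = w * x."""

    def __init__(self):
        self.w = 0.0  # the single trainable weight

    def forward(self, x):
        # Inference/prediction only: no loss, no update.
        return self.w * x

    def training_step(self, batch, batch_idx):
        # One step of the training loop: compute the loss for one batch.
        x, y = batch
        error = self.w * x - y
        loss = error ** 2
        grad = 2 * error * x  # d(loss)/d(w), derived by hand (no autograd)
        return loss, grad


class ToyTrainer:
    """Hypothetical stand-in for the Trainer: owns the loop, not the math."""

    def __init__(self, max_epochs=1, lr=0.1):
        self.max_epochs = max_epochs
        self.lr = lr

    def fit(self, module, batches):
        for _ in range(self.max_epochs):
            for batch_idx, batch in enumerate(batches):
                _, grad = module.training_step(batch, batch_idx)
                module.w -= self.lr * grad  # plain SGD update


model = ToyModule()
ToyTrainer(max_epochs=20).fit(model, [(1.0, 2.0), (2.0, 4.0), (0.5, 1.0)])
prediction = model.forward(3.0)  # close to 6.0 once w has converged to 2
```

The point of the sketch is the division of labor: the module only says how to compute a loss for one batch, and the trainer decides when and how often that happens.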
dataset = MNIST(os.getcwd(), download=True, transform=transforms.ToTensor())
train, val = random_split(dataset, [55000, 5000])

autoencoder = LitAutoEncoder()
trainer = pl.Trainer()
trainer.fit(autoencoder, DataLoader(train), DataLoader(val))
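The split sizes passed to random_split above add up to the 60,000 images in the MNIST training set (55,000 + 5,000). As an illustration of what such a split does conceptually, here is a plain-Python sketch. split_indices is a hypothetical helper and is not how torch implements random_split.

```python
import random


def split_indices(n_items, lengths, seed=0):
    """Shuffle range(n_items) and cut it into consecutive chunks of the given lengths."""
    if sum(lengths) != n_items:
        raise ValueError("split lengths must add up to the dataset size")
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # seeded so the split is repeatable
    splits, start = [], 0
    for length in lengths:
        splits.append(indices[start:start + length])
        start += length
    return splits


train_idx, val_idx = split_indices(60000, [55000, 5000])
```

Each index lands in exactly one split, so no sample is shared between training and validation.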
# 8 GPUs
trainer = pl.Trainer(max_epochs=1, gpus=8)

# 256 GPUs
trainer = pl.Trainer(max_epochs=1, gpus=8, num_nodes=32)

# TPUs
trainer = pl.Trainer(tpu_cores=8)
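The "256 GPUs" comment above is the product of the two arguments: gpus is the per-node count and num_nodes is the number of machines. A minimal sketch of that arithmetic follows. total_devices is a hypothetical helper written for this illustration, not part of Lightning.

```python
def total_devices(gpus_per_node, num_nodes=1):
    """Total GPUs used when each of num_nodes machines contributes gpus_per_node."""
    if gpus_per_node < 0 or num_nodes < 1:
        raise ValueError("need gpus_per_node >= 0 and num_nodes >= 1")
    return gpus_per_node * num_nodes


single_node = total_devices(8)   # gpus=8 on one machine
cluster = total_devices(8, 32)   # gpus=8 with num_nodes=32
```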
import tempfile

# torchscript
autoencoder = LitAutoEncoder()
torch.jit.save(autoencoder.to_torchscript(), "model.pt")

# onnx
with tempfile.NamedTemporaryFile(suffix='.onnx', delete=False) as tmpfile:
    autoencoder = LitAutoEncoder()
    # the sample must match the encoder's input size (28 * 28 features)
    input_sample = torch.randn((1, 28 * 28))
    autoencoder.to_onnx(tmpfile.name, input_sample, export_params=True)
    os.path.isfile(tmpfile.name)
class LitAutoEncoder(pl.LightningModule):
    def training_step(self, batch, batch_idx, opt_idx):
        # access your optimizers; use_pl_optimizer defaults to True
        # (pass use_pl_optimizer=False to get the unwrapped optimizers)
        (opt_a, opt_b) = self.optimizers(use_pl_optimizer=True)

        loss_a = ...
        self.manual_backward(loss_a, opt_a)
        opt_a.step()
        opt_a.zero_grad()

        loss_b = ...
        self.manual_backward(loss_b, opt_b, retain_graph=True)
        self.manual_backward(loss_b, opt_b)
        opt_b.step()
        opt_b.zero_grad()
Scale your models to run on any hardware (CPU, GPUs, TPUs) without changing your model
Makes code more readable by decoupling the research code from the engineering
Makes results easier to reproduce
Less error-prone, because it automates most of the training loop and the tricky engineering
Keeps all the flexibility (LightningModules are still PyTorch modules), but removes a ton of boilerplate
Lightning has out-of-the-box integration with the popular logging/visualizing frameworks (Tensorboard, MLFlow, Neptune.ai, Comet.ml, Wandb).
Tested rigorously with every new PR. We test every combination of PyTorch and Python supported versions, every OS, multi GPUs and even TPUs.
Minimal running speed overhead (about 300 ms per epoch compared with pure PyTorch).
GPU training
Distributed GPU (cluster) training
TPU training
EarlyStopping
Logging/Visualizing
Checkpointing
Experiment management
The Lightning community is maintained by:
- 16 core contributors, a mix of professional engineers, research scientists, and Ph.D. students from top AI labs.
- 280+ community contributors.
Lightning is also part of the PyTorch ecosystem which requires projects to have solid testing, documentation and support.
If you have any questions, please:
1. Read the docs.
2. Search through the Discussions.
3. Look it up in our forum (or add a new question).
4. Join our Slack.
Building open-source software with only a few part-time people is hard!
We're venture funded and backed by some of the top VC funds in the world: Index Ventures, Bain Capital Ventures, and First Minute Capital.
Their funding ensures we can continue to build awesome tooling like Grid, give you around-the-clock support, hire a full-time staff, attend conferences, and move faster on implementing the features you request.
To supercharge your research and production work, visit our Grid.ai platform
Grid AI is our native platform for training models at scale on the cloud!
Sign up for early access here
To use Grid, take your regular command:
python my_model.py --learning_rate 1e-6 --layers 2 --gpus 4
And change it to use the grid train command:
grid train --grid_gpus 4 my_model.py --learning_rate 'uniform(1e-6, 1e-1, 20)' --layers '[2, 4, 8, 16]'
The above command will launch 80 experiments (20 * 4), each running on 4 GPUs (320 GPUs in total), with ZERO changes to your code.
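The experiment count above comes from a Cartesian product of the swept values. The sketch below reproduces that arithmetic in plain Python. It assumes that uniform(1e-6, 1e-1, 20) means 20 values sampled uniformly from that range, and build_sweep is a hypothetical helper, not part of the Grid CLI.

```python
import itertools
import random


def build_sweep(lr_low, lr_high, n_samples, layer_choices, seed=0):
    """Return one config dict per (learning_rate, layers) combination."""
    rng = random.Random(seed)
    learning_rates = [rng.uniform(lr_low, lr_high) for _ in range(n_samples)]
    return [
        {"learning_rate": lr, "layers": layers}
        for lr, layers in itertools.product(learning_rates, layer_choices)
    ]


experiments = build_sweep(1e-6, 1e-1, 20, [2, 4, 8, 16])
gpus_per_experiment = 4  # matches --grid_gpus 4 in the command above
total_gpus = len(experiments) * gpus_per_experiment
```

With 20 sampled learning rates and 4 layer settings, the product has 80 entries, and 80 experiments at 4 GPUs each give the 320 GPUs quoted above.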
Please observe the Apache 2.0 license that is listed in this repository. In addition, the Lightning framework is Patent Pending.
If you want to cite the framework feel free to use this (but only if you loved it 😊):
@article{falcon2019pytorch,
  title={PyTorch Lightning},
  author={Falcon, WA},
  journal={GitHub. Note: https://github.com/PyTorchLightning/pytorch-lightning},
  volume={3},
  year={2019}
}