
An analysis of the parameters of the PyTorch DataLoader

class DataLoader(object):
    r"""
    Data loader. Combines a dataset and a sampler, and provides
    single- or multi-process iterators over the dataset.

    Arguments:
        dataset (Dataset): dataset from which to load the data.
        batch_size (int, optional): how many samples per batch to load
            (default: 1).
        shuffle (bool, optional): set to ``True`` to have the data reshuffled
            at every epoch (default: False).
        sampler (Sampler, optional): defines the strategy to draw samples from
            the dataset. If specified, ``shuffle`` must be False.
        batch_sampler (Sampler, optional): like sampler, but returns a batch of
            indices at a time. Mutually exclusive with batch_size, shuffle,
            sampler, and drop_last.
        num_workers (int, optional): how many subprocesses to use for data
            loading. 0 means that the data will be loaded in the main process.
            (default: 0)
        collate_fn (callable, optional): merges a list of samples to form a mini-batch.
        pin_memory (bool, optional): If ``True``, the data loader will copy tensors
            into CUDA pinned memory before returning them.
        drop_last (bool, optional): set to ``True`` to drop the last incomplete batch,
            if the dataset size is not divisible by the batch size. If ``False`` and
            the size of the dataset is not divisible by the batch size, then the last
            batch will be smaller. (default: False)
        timeout (numeric, optional): if positive, the timeout value for collecting a batch
            from workers. Should always be non-negative. (default: 0)
        worker_init_fn (callable, optional): If not None, this will be called on each
            worker subprocess with the worker id (an int in ``[0, num_workers - 1]``) as
            input, after seeding and before data loading. (default: None)

    .. note:: By default, each worker will have its PyTorch seed set to
              ``base_seed + worker_id``, where ``base_seed`` is a long generated
              by the main process using its RNG. However, seeds for other libraries
              may be duplicated upon initializing workers (e.g., NumPy), causing
              each worker to return identical random numbers. (See
              :ref:`dataloader-workers-random-seed` section in FAQ.) You may
              use ``torch.initial_seed()`` to access the PyTorch seed for each
              worker in :attr:`worker_init_fn`, and use it to set other seeds
              before data loading.

    .. warning:: If ``spawn`` start method is used, :attr:`worker_init_fn` cannot be an
                 unpicklable object, e.g., a lambda function.
    """

    __initialized = False

    def __init__(self, dataset, batch_size=1, shuffle=False, sampler=None, batch_sampler=None,
                 num_workers=0, collate_fn=default_collate, pin_memory=False, drop_last=False,
                 timeout=0, worker_init_fn=None):
        self.dataset = dataset
        self.batch_size = batch_size
        self.num_workers = num_workers
        self.collate_fn = collate_fn
        self.pin_memory = pin_memory
        self.drop_last = drop_last
        self.timeout = timeout
        self.worker_init_fn = worker_init_fn
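Putting the arguments above together, here is a minimal sketch of constructing a DataLoader over a toy TensorDataset, showing how batch_size and drop_last interact (the dataset contents are made up for illustration):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# A toy dataset of 10 samples: 3-dim feature vectors with integer labels.
features = torch.randn(10, 3)
labels = torch.arange(10)
dataset = TensorDataset(features, labels)

# batch_size=4, drop_last=False: 10 samples yield batches of sizes 4, 4, 2.
loader = DataLoader(dataset, batch_size=4, shuffle=True, drop_last=False)
print([x.shape[0] for x, y in loader])  # [4, 4, 2]

# drop_last=True discards the incomplete final batch, leaving 2 full batches.
loader = DataLoader(dataset, batch_size=4, shuffle=True, drop_last=True)
print(len(loader))  # 2
```

shuffle only permutes which samples land in which batch; the batch sizes are determined purely by len(dataset), batch_size, and drop_last.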

pin_memory (page-locked memory)

http://www.voidcn.com/article/p-fsdktdik-bry.html
https://blog.csdn.net/dgh_dean/article/details/53130871
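Page-locked (pinned) host memory cannot be swapped out by the OS, which is what allows CUDA to copy from it to the GPU asynchronously. A minimal sketch of the usual pattern, assuming a CUDA runtime is present (without one, recent PyTorch versions simply skip pinning and warn):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.randn(8, 3))

# pin_memory=True makes the loader return batches in page-locked host memory.
loader = DataLoader(dataset, batch_size=4,
                    pin_memory=torch.cuda.is_available())
(batch,) = next(iter(loader))
print(batch.is_pinned())  # True only when an accelerator is available

if torch.cuda.is_available():
    # Pinned source memory lets this copy run asynchronously with host code.
    batch = batch.to("cuda", non_blocking=True)
```

non_blocking=True only helps when the source tensor is pinned; otherwise the copy is synchronous regardless.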

num_workers (worker subprocesses; note these are processes, not threads)
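The docstring's note about duplicated seeds matters as soon as num_workers > 0: forked workers can inherit identical NumPy RNG state and return the same "random" numbers. A small sketch of the recommended fix, deriving a per-worker NumPy seed from torch.initial_seed() inside worker_init_fn (the dataset here is invented for illustration):

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, Dataset

class RandomDataset(Dataset):
    """Returns one NumPy random number per index, to expose the seeding issue."""
    def __len__(self):
        return 8

    def __getitem__(self, idx):
        return np.random.rand()

def seed_numpy(worker_id):
    # torch.initial_seed() is already distinct per worker (base_seed + worker_id),
    # so reusing it to seed NumPy gives each worker its own stream.
    np.random.seed(torch.initial_seed() % 2**32)

loader = DataLoader(RandomDataset(), batch_size=2, num_workers=2,
                    worker_init_fn=seed_numpy)
values = [float(v) for batch in loader for v in batch]
print(len(set(values)))  # 8 distinct values, no cross-worker duplicates
```

Note that seed_numpy is a top-level function, not a lambda, so it stays picklable under the spawn start method as the docstring's warning requires.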

This article is a reader submission and does not represent the views of 程序员编程网. When reprinting, please cite the source: http://www.cxybcw.com/187543.html
