  1. .\" -*- nroff -*-
  2. .\" Copyright © 2010-2020 Inria. All rights reserved.
  3. .\" Copyright © 2010 Université of Bordeaux
  4. .\" Copyright © 2009-2010 Cisco Systems, Inc. All rights reserved.
  5. .\" See COPYING in top-level directory.
  6. .TH HWLOC "7" "Sep 07, 2023" "2.9.3" "hwloc"
  7. .SH NAME
  8. hwloc - General information about hwloc ("hardware locality").
  9. .
  10. .\" **************************
  11. .\" Description Section
  12. .\" **************************
  13. .SH DESCRIPTION
  14. .
  15. hwloc provides command line tools and a C API to obtain the
  16. hierarchical map of key computing elements, such as: NUMA memory
  17. nodes, shared caches, processor packages, processor cores, and
  18. processor "threads". hwloc also gathers various attributes such as
  19. cache and memory information, and is portable across a variety of
  20. different operating systems and platforms.
  21. .
  22. .
  23. .SS Definitions
  24. hwloc has some specific definitions for terms that are used in this
  25. man page and other hwloc documentation.
  26. .
  27. .TP 5
  28. .B hwloc CPU set:
  29. A set of processors included in an hwloc object, expressed as a bitmask
  30. indexed by the physical numbers of the CPUs (as announced by the OS).
  31. The hwloc definition
  32. of "CPU set" does not carry any of the same connotations as Linux's "CPU
  33. set" (e.g., process affinity, cgroup, etc.).
  34. .
  35. .TP
  36. .B hwloc node set:
  37. A set of NUMA memory nodes near an hwloc object, expressed as a bitmask
  38. indexed by the physical numbers of the NUMA nodes (as announced by the OS).
  39. .
  40. .TP
  41. .B Linux CPU set:
  42. See http://www.mjmwired.net/kernel/Documentation/cpusets.txt for a
  43. discussion of Linux CPU sets. A
  44. super-short-ignoring-many-details description (taken from that page)
  45. is:
  46. .br
  47. .br
  48. "Cpusets provide a mechanism for assigning a set of CPUs and Memory
  49. Nodes to a set of tasks."
  50. .
  51. .TP
  52. .B Linux Cgroup:
  53. See http://www.mjmwired.net/kernel/Documentation/cgroups.txt for a
  54. discussion of Linux control groups. A
  55. super-short-ignoring-many-details description (taken from that page)
  56. is:
  57. .br
  58. .br
  59. "Control Groups provide a mechanism for aggregating/partitioning sets
  60. of tasks, and all their future children, into hierarchical groups
  61. with specialized behaviour."
  62. .
  63. .PP
  64. To be clear, hwloc supports all of the above concepts. It is simply
  65. worth noting that they are different things.
  66. .
  67. .SS Location Specification
  68. .
  69. Locations refer to specific regions within a topology. Before reading
  70. the rest of this man page, it may be useful to read lstopo(1) and/or
  71. run lstopo on your machine to see the reported topology tree. Seeing
  72. and understanding a topology tree will definitely help in
  73. understanding the concepts that are discussed below.
  74. .
  75. .PP
  76. Locations can be specified in multiple ways:
  77. .
  78. .TP 10
  79. .B Tuples:
  80. Tuples of hwloc "objects" and associated indexes can be specified in
  81. the form
  82. .IR object:index .
  83. hwloc objects represent types of mapped items (e.g., packages, cores,
  84. etc.) in a topology tree; indexes are non-negative integers that
  85. specify a unique physical object in a topology tree. Both concepts
  86. are described in detail, below.
  87. .br
  88. .br
  89. Indexes may also be specified as ranges.
  90. \fIx-y\fR enumerates from index \fIx\fR to \fIy\fR.
  91. \fIx:y\fR enumerates \fIy\fR objects starting from index \fIx\fR
  92. (wrapping around the end of the index range if needed).
  93. \fIx-\fR enumerates all objects starting from index \fIx\fR.
  94. \fIall\fR, \fIodd\fR, and \fIeven\fR are also supported for listing
  95. all objects, or only those with odd or even indexes.
  96. .br
  97. .br
  98. Chaining multiple tuples together in the more general form
  99. .I object1:index[.object2:index2[...]]
  100. is permissible. While the first tuple's object may appear anywhere in
  101. the topology, the Nth tuple's object must have a shallower topology
  102. depth than the (N+1)th tuple's object. Put simply: as you move right
  103. in a tuple chain, objects must go deeper in the topology tree.
  104. When using logical indexes (which is the default),
  105. indexes specified in chained tuples are relative to the scope of the
  106. parent object. For example, "package:0.core:1" refers to the second
  107. core in the first package.
  108. .br
  109. .br
  110. When using OS/physical indexes, the first object matching the given
  111. index is used.
  112. .br
  113. .br
  114. PCI and OS devices may also be designated using their identifier.
  115. For example, "\fBpci=02:03.1\fR" is the PCI device with bus ID "02:03.1".
  116. .
  117. "\fBos=eth0\fR" is the network interface whose software name is "eth0".
  118. .
  119. PCI devices may also be filtered based on their vendor and/or device IDs,
  120. for instance "\fBpci[15b3:]:2\fR" for the third Mellanox PCI device (vendor ID 0x15b3).
  121. .
  122. OS devices may also be filtered based on their subtype,
  123. for instance "\fBos[gpu]:all\fR" for all GPU OS devices.
  124. .
  125. .TP
  126. .B Hex:
  127. For tools that manipulate objects as sets (e.g. hwloc-calc and hwloc-bind),
  128. locations can also be specified as hexadecimal bitmasks prefixed
  129. .
  130. with "0x". Commas must be used to separate the hex digits into blocks
  131. of 8, such as "0xffc0140,0x00020110".
  132. .
  133. Leading zeros in each block do not need to be specified.
  134. .
  135. For example, "0xffc0140,0x20110" is equivalent to the prior example,
  136. and "0x0000000f" is exactly equivalent to "0xf". Intermediate blocks
  137. of 8 digits that are all zero can be left empty; "0xff0,,0x13" is
  138. equivalent to "0xff0,0x00000000,0x13".
  139. .
  140. If the location is prefixed with the special string "0xf...f", then
  141. all unspecified bits are set (as if the set were infinite). For
  142. example, "0xf...f,0x1" sets both the first bit and all bits starting
  143. with the 33rd. The string "0xf...f" -- with no other specified values
  144. -- sets all bits.
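.PP
The bitmask rules above can be sketched in a few lines of Python.
This is an illustrative reading of the text in this section, not hwloc's own parser;
the names parse_bitmask and is_set are hypothetical, and the sketch assumes that
every non-empty block carries its own "0x" prefix, as in the examples above.

```python
def parse_bitmask(text):
    """Parse an hwloc-style bitmask string (illustrative sketch only).

    Returns (value, fill_from): value holds the explicitly listed bits;
    fill_from is None for a finite set, or the index of the first bit of
    the infinite all-ones tail introduced by a leading "0xf...f".
    """
    blocks = text.split(",")
    infinite = blocks[0] == "0xf...f"
    if infinite:
        blocks = blocks[1:]
    value = 0
    for block in blocks:
        if block == "":
            word = 0  # an empty block stands for 0x00000000
        elif block.startswith("0x") and 1 <= len(block) - 2 <= 8:
            word = int(block[2:], 16)  # leading zeros are optional
        else:
            raise ValueError("bad block: %r" % block)
        value = (value << 32) | word  # most significant block comes first
    return value, (32 * len(blocks) if infinite else None)


def is_set(mask, bit):
    """Tell whether bit number `bit` (counted from 0) is set in a parsed mask."""
    value, fill_from = mask
    if fill_from is not None and bit >= fill_from:
        return True  # inside the infinite tail: every bit is set
    return (value >> bit) & 1 == 1
```

With this sketch, "0xff0,,0x13" and "0xff0,0x00000000,0x13" parse to the same value,
and "0xf...f,0x1" reports bit 0 and every bit from bit 32 (the 33rd bit) onward as set.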
  145. .
  146. .PP
  147. "all" and "root" are special locations consisting of the root
  148. object of the tree, which contains the entire current topology.
  149. .
  150. .PP
  151. Some tools directly operate on these objects (e.g. hwloc-info and hwloc-annotate).
  152. They do not support hexadecimal locations because each location may
  153. correspond to multiple objects.
  154. For instance, there can be exactly one L3 cache per package and NUMA node,
  155. which means that all three objects share the same location.
  156. .
  157. If multiple locations are given on the command-line,
  158. these tools will operate on each location individually and consecutively.
  159. .
  160. .PP
  161. Some other tools internally manipulate objects as sets (e.g. hwloc-calc and hwloc-bind).
  162. They translate each input location into a hexadecimal location.
  163. When I/O or Misc objects are used, they are translated into the set
  164. of processors (or NUMA nodes) that are close to the given object
  165. (because I/O or Misc objects do not contain processors or NUMA nodes).
  166. .
  167. .PP
  168. If multiple locations are specified on the command-line (delimited by whitespace),
  169. they are combined (the overall location is wider).
  170. .
  171. If prefixed with "~", the given location
  172. will be cleared instead of added to the current list of locations. If
  173. prefixed with "x", the given location will be and'ed instead of added
  174. to the current list. If prefixed with "^", the given location will be
  175. xor'ed.
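.PP
As a rough illustration of how the prefixes described above combine, the following Python
sketch folds a list of command-line locations into one bitmask, processing them from left
to right.
It is an assumption-laden model rather than the code used by hwloc-calc or hwloc-bind:
the function name combine_locations is hypothetical, and the default resolver only
understands single-block hexadecimal masks such as "0xf".

```python
def combine_locations(args, resolve=lambda text: int(text, 16)):
    """Fold location arguments into one bitmask (illustrative sketch only).

    `resolve` turns one location string into an integer bitmask; the
    default only handles a single hexadecimal block such as "0xf".
    """
    current = 0
    for arg in args:
        if arg.startswith("~"):
            current &= ~resolve(arg[1:])  # clear the given location
        elif arg.startswith("x"):
            current &= resolve(arg[1:])   # and with the given location
        elif arg.startswith("^"):
            current ^= resolve(arg[1:])   # xor with the given location
        else:
            current |= resolve(arg)       # default: add the location
    return current
```

For example, under these assumptions the arguments "0xf" followed by "~0x3" leave bits 2 and 3
set (0xc), since the second argument clears the two lowest bits.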
  176. .
  177. .PP
  178. More complex operations may be performed by using
  179. .IR hwloc-calc
  180. to compute intermediate values.
  181. .
  182. .SS hwloc Objects
  183. .
  184. .PP
  185. Objects in tuples can be any of the following strings
  186. .
  187. (listed from "biggest" to "smallest"):
  188. .
  189. .TP 10
  190. .B machine
  191. A set of processors and memory.
  192. .
  193. .TP
  194. .B numanode
  195. A NUMA node; a set of processors around memory which the processors
  196. can directly access.
  197. .
  198. If \fBhbm\fR is used instead of \fBnumanode\fR in locations,
  199. command-line tools only consider high-bandwidth memory nodes such as Intel Xeon Phi MCDRAM.
  200. .
  201. .TP
  202. .B package
  203. Typically a physical package or chip that goes into a socket;
  204. it is a grouping of one or more processors.
  205. .
  206. .TP
  207. .B l1cache ... l5cache
  208. A data (or unified) cache.
  209. .
  210. .TP
  211. .B l1icache ... l3icache
  212. An instruction cache.
  213. .
  214. .TP
  215. .B core
  216. A single, physical processing unit which may still contain multiple
  217. logical processors, such as hardware threads.
  218. .
  219. .TP
  220. .B pu
  221. Short for
  222. .I processor unit
  223. (not
  224. .IR process !).
  225. The smallest physical execution unit that hwloc recognizes. For
  226. example, there may be multiple PUs on a core (e.g.,
  227. hardware threads).
  228. .PP
  229. \fBosdev\fR, \fBpcidev\fR, \fBbridge\fR, and \fBmisc\fR may also be used
  230. to specify special devices, although some of them have dedicated
  231. identifiers, as explained in \fBLocation Specification\fR.
  232. .
  233. .PP
  234. Finally, note that an object can be denoted by its numeric "depth" in
  235. the topology graph.
  236. .
  237. .SS hwloc Indexes
  238. Indexes are integer values that uniquely specify a given object of a
  239. specific type. Indexes can be expressed either as
  240. .I logical
  241. values or
  242. .I physical
  243. values. Most hwloc utilities accept logical indexes by default.
  244. Passing
  245. .B --physical
  246. switches to physical/OS indexes.
  247. Both logical and physical indexes are described on this man page.
  248. .
  249. .PP
  250. .I Logical
  251. indexes are relative to the object order in the output from the
  252. lstopo command. They always start with 0 and increment by 1 for each
  253. successive object.
  254. .
  255. .PP
  256. .I Physical
  257. indexes are how the operating system refers to objects. Note that
  258. while physical indexes are non-negative integer values, the hardware
  259. and/or operating system may choose arbitrary values -- they may not
  260. start with 0, and successive objects may not have consecutive values.
  261. .
  262. .PP
  263. For example, if the first few lines of lstopo -p output are the
  264. following:
  265. .
  266.   Machine (47GB)
  267.     NUMANode P#0 (24GB) + Package P#0 + L3 (12MB)
  268.       L2 (256KB) + L1 (32KB) + Core P#0 + PU P#0
  269.       L2 (256KB) + L1 (32KB) + Core P#1 + PU P#0
  270.       L2 (256KB) + L1 (32KB) + Core P#2 + PU P#0
  271.       L2 (256KB) + L1 (32KB) + Core P#8 + PU P#0
  272.       L2 (256KB) + L1 (32KB) + Core P#9 + PU P#0
  273.       L2 (256KB) + L1 (32KB) + Core P#10 + PU P#0
  274.     NUMANode P#1 (24GB) + Package P#1 + L3 (12MB)
  275.       L2 (256KB) + L1 (32KB) + Core P#0 + PU P#0
  276.       L2 (256KB) + L1 (32KB) + Core P#1 + PU P#0
  277.       L2 (256KB) + L1 (32KB) + Core P#2 + PU P#0
  278.       L2 (256KB) + L1 (32KB) + Core P#8 + PU P#0
  279.       L2 (256KB) + L1 (32KB) + Core P#9 + PU P#0
  280.       L2 (256KB) + L1 (32KB) + Core P#10 + PU P#0
  281. In this example, the first core on the second package is logically
  282. number 6 (i.e., logically the 7th core, starting from 0). Its
  283. physical index is 0, but note that another core
  284. .I also
  285. has a physical index of 0. Hence, physical indexes may only be
  286. relevant within the scope of their parent (or set of ancestors).
  287. In this example, to uniquely identify logical core 6 with
  288. physical indexes, you must specify (at a minimum) both a package and a
  289. core: package 1, core 0.
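.PP
The relationship between logical and physical core indexes in this example can be
written down as a small Python table.
The sketch below simply transcribes the lstopo output shown above; the names CORES
and logical_cores_with_physical are hypothetical and are not part of hwloc.

```python
# (package physical index, core physical index) for every core, listed in
# lstopo order; the position in this list is the core's logical index.
CORES = [(package, core)
         for package in (0, 1)
         for core in (0, 1, 2, 8, 9, 10)]


def logical_cores_with_physical(core_physical):
    """Return the logical indexes of all cores with this physical index."""
    return [logical
            for logical, (_package, core) in enumerate(CORES)
            if core == core_physical]
```

Here CORES[6] is (1, 0), that is, package 1 and core 0, while
logical_cores_with_physical(0) returns two logical indexes, 0 and 6, which is why a
physical core index alone is ambiguous on this machine.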
  290. .PP
  291. Index values, regardless of whether they are logical or physical, can
  292. be expressed in several different forms (where X and Y are
  293. non-negative integers and N is a positive integer):
  294. .
  295. .TP 10
  296. .B X
  297. The object with index value X.
  298. .
  299. .TP
  300. .B X-Y
  301. All the objects with index values >= X and <= Y.
  302. .
  303. .TP
  304. .B X-
  305. All the objects with index values >= X.
  306. .
  307. .TP
  308. .B X:N
  309. N objects starting with index X, possibly wrapping around the end of
  310. the level.
  311. .
  312. .TP
  313. .B all
  314. A special index value indicating all valid index values.
  315. .
  316. .TP
  317. .B odd
  318. A special index value indicating all valid odd index values.
  319. .
  320. .TP
  321. .B even
  322. A special index value indicating all valid even index values.
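.PP
The index forms listed above can be illustrated with a short Python sketch that expands
one index specification into a list of indexes.
It assumes a level of count objects whose indexes run from 0 to count - 1 (as logical
indexes do), performs no validation, and uses the hypothetical name expand_index;
it is not how hwloc itself parses indexes.

```python
def expand_index(spec, count):
    """Expand one index spec against a level of `count` objects
    indexed 0..count-1 (illustrative sketch, no input validation)."""
    if spec == "all":
        return list(range(count))
    if spec == "odd":
        return list(range(1, count, 2))
    if spec == "even":
        return list(range(0, count, 2))
    if ":" in spec:
        # X:N means N objects starting at X, wrapping around the level
        start, number = (int(part) for part in spec.split(":"))
        return [(start + i) % count for i in range(number)]
    if spec.endswith("-"):
        # X- means every index >= X
        return list(range(int(spec[:-1]), count))
    if "-" in spec:
        # X-Y means every index >= X and <= Y
        first, last = (int(part) for part in spec.split("-"))
        return list(range(first, last + 1))
    return [int(spec)]  # a single index X
```

On a level of 8 objects, for instance, "6:3" expands to indexes 6, 7, and 0 because the
enumeration wraps around the end of the level.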
  323. .
  324. .PP
  325. .IR REMEMBER :
  326. hwloc's command line tools accept
  327. .I logical
  328. indexes for location values by default.
  329. Use
  330. .BR --physical " and " --logical
  331. to switch from one mode to another.
  332. .
  333. .\" **************************
  334. .\" See also section
  335. .\" **************************
  336. .SH SEE ALSO
  337. .
  338. hwloc's command line tool documentation: lstopo(1), hwloc-bind(1),
  339. hwloc-calc(1), hwloc-distrib(1), hwloc-ps(1).
  340. .
  341. .PP
  342. hwloc has many C API functions, each of which has its own man page.
  343. Some top-level man pages are also provided, grouping similar functions
  344. together. A few good places to start might include:
  345. hwlocality_objects(3), hwlocality_types(3), hwlocality_creation(3),
  346. hwlocality_cpuset(3), hwlocality_information(3), and
  347. hwlocality_binding(3).
  348. .
  349. .PP
  350. For a listing of all available hwloc man pages, look at all "hwloc*"
  351. files in the man1 and man3 directories.