[NVM API] nvm_get_regions() doesn't return the correct region 'freecapacity' when namespaces exist in the region #188

Open
sscargal opened this issue Mar 17, 2022 · 2 comments

sscargal commented Mar 17, 2022

This issue was originally reported on the #pmem Slack channel by Tom Nabarro:

currently seeing a difference in region free_capacity when using the C API nvm_get_regions() (major version 2) compared to the CLI tool (same major version), any ideas:
@wolf-157:~/projects/daos> sudo ipmctl show -region
SocketID | ISetID             | PersistentMemoryType | Capacity     | FreeCapacity | HealthState
==================================================================================================
0x0000   | 0x1d427f4835f32ccc | AppDirect            | 3012.000 GiB | 0.000 GiB    | Healthy
0x0001   | 0xef5a7f48cef32ccc | AppDirect            | 3012.000 GiB | 0.000 GiB    | Healthy
But
DEBUG 14:32:44.167888 ipmctl.go:323: discovered pmem regions: [{IsetId:2108387523682315468 Type:1 Capacity:3234110373888 Free_capacity:3234110373888 Socket_id:0 Dimm_count:6 Dimms:[1 17 33 257 273 289 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0] Health:1 Reserved:[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]} {IsetId:17247237673655151820 Type:1 Capacity:3234110373888 Free_capacity:3234110373888 Socket_id:1 Dimm_count:6 Dimms:[4097 4113 4129 4353 4369 4385 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0] Health:1 Reserved:[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]}]

I responded with:

We can see from Region.c#L1165 that NamespaceCapacityUsed is subtracted from pRegion->Size to get the pFreeCapacity value:
*pFreeCapacity = pRegion->Size - NamespaceCapacityUsed;
It would be best to open an issue on https://github.com/intel/ipmctl/issues to get feedback from the developers, but it looks like the call to nvm_get_regions() isn't subtracting the namespace capacity.
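
As a rough illustration of the calculation quoted above, the sketch below computes free capacity as the region size minus the capacity consumed by namespaces. `RegionSketch` and `free_capacity()` are hypothetical names made up for this example and are not part of ipmctl. The clamp to zero is also an assumption and is not confirmed from Region.c.

```cpp
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical stand-in for a region; this is NOT an ipmctl type.
struct RegionSketch {
  std::uint64_t size_bytes;                          // raw region capacity
  std::vector<std::uint64_t> namespace_sizes_bytes;  // capacity of each namespace
};

// Free capacity = raw size minus the sum of namespace capacities.
// Clamping to zero (an assumption) avoids unsigned wrap-around.
std::uint64_t free_capacity(const RegionSketch &r) {
  const std::uint64_t used =
      std::accumulate(r.namespace_sizes_bytes.begin(),
                      r.namespace_sizes_bytes.end(), std::uint64_t{0});
  return used >= r.size_bytes ? 0 : r.size_bytes - used;
}
```

With a single namespace that spans the whole region, this returns 0, which is what the CLI reports for FreeCapacity.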

After investigating further, the following C++ code reproduces Tom's issue:

#include <cstdlib>
#include <iostream>
#include <nvm_management.h>

using std::cout;
using std::endl;

/*
 * Build using (the library follows the source file so the linker resolves it):
 *   $ g++ -o ipmctlAPItest ipmctlAPItest.cpp -lipmctl
 */

int main()
{
  NVM_UINT8 num_of_regions = 0;
  int retval;

  // Get the number of regions; stop on failure because the count is needed below
  if ((retval = nvm_get_number_of_regions(&num_of_regions)) != NVM_SUCCESS) {
    cout << "Error: nvm_get_number_of_regions returned " << retval << endl;
    return EXIT_FAILURE;
  }
  cout << "Number of regions: " << unsigned(num_of_regions) << endl;

  // Nothing to query if there are no regions
  if (num_of_regions == 0) {
    return EXIT_SUCCESS;
  }

  // Allocate one region struct per region
  region *p_region = (region *)malloc(sizeof(region) * num_of_regions);
  if (p_region == NULL) {
    cout << "Error: Cannot allocate memory." << endl;
    return EXIT_FAILURE;
  }

  // Get the region information; do not print the buffer if the call failed
  if ((retval = nvm_get_regions(p_region, &num_of_regions)) != NVM_SUCCESS) {
    cout << "Error: nvm_get_regions returned " << retval << endl;
    free(p_region);
    return EXIT_FAILURE;
  }

  // Iterate over the region structs
  region *p_region_iter = p_region;
  for (int i = 0; i < num_of_regions; i++) {
    cout << "p_region[" << i << "]->isetId: " << p_region_iter->isetId << endl
         << "p_region[" << i << "]->capacity: " << p_region_iter->capacity << endl
         << "p_region[" << i << "]->free_capacity: " << p_region_iter->free_capacity << endl;
    p_region_iter++;
  }

  // Free the memory
  free(p_region);

  // Exit
  return EXIT_SUCCESS;
}

Returns:

# ./ipmctlAPItest
Number of regions: 2
p_region[0]->isetId: 3259620181632232652
p_region[0]->capacity: 1623497637888
p_region[0]->free_capacity: 1623497637888
p_region[1]->isetId: 15940630830656007372
p_region[1]->capacity: 1623497637888
p_region[1]->free_capacity: 1623497637888

One would expect free_capacity to have been calculated. On my system it should be zero (0), as shown by ipmctl show -region:

# ipmctl show -region
 SocketID | ISetID             | PersistentMemoryType | Capacity     | FreeCapacity | HealthState
==================================================================================================
 0x0000   | 0x2d3c7f48f4e22ccc | AppDirect            | 1512.000 GiB | 0.000 GiB    | Healthy
 0x0001   | 0xdd387f488ce42ccc | AppDirect            | 1512.000 GiB | 0.000 GiB    | Healthy
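
To compare the two outputs directly, the API values (bytes and a decimal isetId) can be converted to the units the CLI prints. The helpers below are a minimal sketch with hypothetical names (`to_gib_string`, `to_hex_string`); they are not part of ipmctl. They assume the CLI reports GiB with three decimals, as in the table above.

```cpp
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>

// Format a byte count as GiB with three decimals, mirroring the CLI table.
std::string to_gib_string(std::uint64_t bytes) {
  std::ostringstream out;
  out << std::fixed << std::setprecision(3)
      << static_cast<double>(bytes) / (1024.0 * 1024.0 * 1024.0) << " GiB";
  return out.str();
}

// Format a 64-bit interleave set ID as 0x-prefixed, zero-padded hex.
std::string to_hex_string(std::uint64_t value) {
  std::ostringstream out;
  out << "0x" << std::hex << std::setfill('0') << std::setw(16) << value;
  return out.str();
}
```

For the values in this report, `to_gib_string(1623497637888)` gives "1512.000 GiB" and `to_hex_string(3259620181632232652)` gives "0x2d3c7f48f4e22ccc". The API and the CLI therefore agree on the region identity and capacity and differ only in free capacity.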

nvm_get_regions() calls nvm_get_regions_ex(), which populates the region struct with the results from gNvmDimmDriverNvmDimmConfig.GetRegions().

From nvm_management.c:

NVM_API int nvm_get_regions_ex(const NVM_BOOL use_nfit, struct region *p_regions, NVM_UINT8 *count)
{
  COMMAND_STATUS *pCommandStatus = NULL;
  NVM_UINT8 RegionCount, Index, DimmIndex;
  REGION_INFO *pRegions = NULL;

  ...

  erc = gNvmDimmDriverNvmDimmConfig.GetRegions(&gNvmDimmDriverNvmDimmConfig, RegionCount, use_nfit, pRegions, pCommandStatus);

  ...
  
  for (Index = 0; Index < RegionCount; Index++) {
    memset(&p_regions[Index], 0, sizeof(struct region));
    p_regions[Index].socket_id = pRegions[Index].SocketId;
    p_regions[Index].isetId = pRegions[Index].CookieId;
    p_regions[Index].capacity = pRegions[Index].Capacity;
    p_regions[Index].free_capacity = pRegions[Index].FreeCapacity;
    p_regions[Index].health = pRegions[Index].Health;
    p_regions[Index].type = pRegions[Index].RegionType;
    p_regions[Index].dimm_count = pRegions[Index].DimmIdCount;

    for (DimmIndex = 0; DimmIndex < pRegions[Index].DimmIdCount; DimmIndex++)
      p_regions[Index].dimms[DimmIndex] = pRegions[Index].DimmId[DimmIndex];
  }
   
  ...
}  

Where REGION_INFO is defined as:

/* Region Information provides details about a PMEM region (interleave set).*/
typedef struct _REGION_INFO {
  UINT16 RegionId;                  ///< Region identifier
  UINT16 SocketId;                  ///< Socket identifier
  UINT8 RegionType;                 ///< Region type
  UINT64 Capacity;                  ///< Region total raw capacity
  UINT64 FreeCapacity;              ///< Region total free capacity. Raw less capacity used by namespaces
  UINT64 AppDirNamespaceMaxSize;    ///< Maximum size of an AppDirect namespace
  UINT64 AppDirNamespaceMinSize;    ///< Minimum size of an AppDirect namespace
  UINT16 Health;                    ///< Health state of region
  UINT16 DimmId[12];                ///< PMem module IDs associated with this region
  UINT16 DimmIdCount;               ///< Number of PMem modules found in DimmId
  UINT64 CookieId;                  ///< Interleave set ID
  HII_POINTER PtrInterlaveFormats;  ///< Pointer to array of Interleave Formats
  UINT32 InterleaveFormatsNum;      ///< Number of Interleave Formats
} REGION_INFO;

Note, the comment for FreeCapacity says "Region total free capacity. Raw less capacity used by namespaces", so this isn't working as documented.

@StevenPontsler

We will take a look at it.

@tanabarr

I have reverted to retrieving free capacity by scraping the ipmctl CLI output, because nvm_get_regions_ex(use_nfit=false, ...) takes a minute to return, but it would be preferable to use the libipmctl API. I found that setting use_nfit=false and calling nvm_uninit() between calls results in the FreeCapacity stat being updated as expected. Unfortunately, the latency of that call is not acceptable, hence the switch.
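
For anyone needing the same interim workaround, here is a minimal sketch of scraping the `ipmctl show -region` table. It assumes the six-column, pipe-separated layout shown earlier in this issue and that capacities are printed in GiB. `RegionRow` and `parse_show_region()` are hypothetical names for this example, not part of any ipmctl or DAOS code. Running the command and capturing its output is left out.

```cpp
#include <exception>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical holder for one parsed table row.
struct RegionRow {
  std::string socket_id;
  std::string iset_id;
  double capacity_gib;
  double free_capacity_gib;
};

// Strip leading and trailing spaces, tabs, and carriage returns.
static std::string trim(const std::string &s) {
  const auto first = s.find_first_not_of(" \t\r");
  if (first == std::string::npos) return "";
  const auto last = s.find_last_not_of(" \t\r");
  return s.substr(first, last - first + 1);
}

// Parse the text of `ipmctl show -region`, skipping header and separator lines.
std::vector<RegionRow> parse_show_region(const std::string &text) {
  std::vector<RegionRow> rows;
  std::istringstream lines(text);
  std::string line;
  while (std::getline(lines, line)) {
    std::vector<std::string> cells;
    std::istringstream fields(line);
    std::string cell;
    while (std::getline(fields, cell, '|')) cells.push_back(trim(cell));
    // Data rows have six columns and a SocketID beginning with "0x".
    if (cells.size() != 6 || cells[0].compare(0, 2, "0x") != 0) continue;
    try {
      // std::stod reads the leading number and ignores the " GiB" suffix.
      rows.push_back({cells[0], cells[1], std::stod(cells[3]), std::stod(cells[4])});
    } catch (const std::exception &) {
      // Skip rows whose capacity columns are not numeric.
    }
  }
  return rows;
}
```

Scraping is brittle if the CLI output format or units change, which is one more reason the API route would be preferable once free_capacity is reported correctly.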
