Treatment of NaN in flex #478

graeme-winter · 2020-05-18T08:24:36Z

It seems that NaN's confuse flex histogram...

    ratio = flex.log10(t_i.select(t_nn) / r_i.select(r_nn)).as_double()

    print("Min: %f" % flex.min(ratio))
    print("Number < -5: %d" % (ratio < -5).count(True))
    
    r_hist = flex.histogram(ratio, data_min=-10, data_max=10, n_slots=10)

    for c, v in zip(r_hist.slot_centers(), r_hist.slots()):
        print(c, v)

It turns out there are 6,338 NaN's in ratio, which all end up in the smallest bin:

Min: -5.173720
Number < -5: 2
-9.0 6338
-7.0 0
-5.0 9
-3.0 59
-1.0 481347
1.0 343104
3.0 61
5.0 1
7.0 0
9.0 0

I believe flex.histogram should not be treating NaN in this way...

The text was updated successfully, but these errors were encountered:

graeme-winter · 2020-05-18T08:48:48Z

      std::size_t
      get_i_slot(ValueType const& d_)
      {
        std::size_t i_slot = 0;
        ValueType d = d_ - data_min_;
        if (d != 0 && d >= slot_width_) {
              i_slot = static_cast<std::size_t>(d / slot_width_);
          if (i_slot >= slots_.size()) i_slot = slots_.size() - 1;
        }
        return i_slot;
      }

    protected:
      void
      assign_to_slot(ValueType const& d)
      {
        slots_[get_i_slot(d)]++;
      }

in scitbx/histogram.h - guess

if (not isnan(d)) {
        slots_[get_i_slot(d)]++;
}

would fix this, opinions?

graeme-winter · 2020-05-18T09:10:51Z

Update: this is not limited to flex.histogram - flex.min gives some weird results:

from scitbx.array_family import flex
import math

x = flex.double(range(-10, 100)) + 0.5
l = flex.log10(x)

for _ in l:
    if math.isnan(_):
        continue
    print(_)
    break

print(flex.min(l), flex.max(l))

gives

-0.3010299956639812
1.9800033715837464 1.9978230807457256

i.e. flex.min() gives the wrong answer if there are NaN's in the input list

graeme-winter · 2020-05-18T09:25:58Z

To be clear following the last comment: either NaN or min(!NaN) values should be returned, not some other random element. Returning NaN would be consistent with std::min and the true min value with fmin from math.h

Includes test to assert this behaviour. Means histogram behaves as expected for #478 however issues with min() and max() remain

phyy-nx · 2020-05-26T22:27:19Z

Hi, so first I note that your test behaves differently on my centos7 build and my macos build. Since #463, macos builds have removed -ffast-math, so that's likely the reason.

Test:

from scitbx.array_family import flex
import math

x = flex.double(range(-10, 100)) + 0.5
l = flex.log10(x)
print(flex.min(l), flex.max(l))

Min result on centos7: -0.3010299956639812
Min result on macos: NaN

In neither case do I get your min result of 1.9800033715837464. Not sure what is going on, but it's enough to say that how the software is compiled greatly impacts the results.

That said, I do have some context that could help inform this discussion (likely familiar to all parties). To my knowledge, cctbx doesn't claim IEEE compatibility. Indeed, until #324, we explicitly crashed developer environments when many operations that produce NaNs were executed (such as 1/0). The reason mainly is because we wanted --fast-math. So it was up to the developer to not allow their code to reach a situation where NaNs were possible.

Since #324, it could be argued that cctbx should support NaNs. This is a big statement to make. Your commit in #479 adds a check for NaN for every entry in a flex array while creating a histogram. The same check would need to be added to every flex operation, not just min and max, but mean, median, etc. That seems to me like a big overhead, but maybe modern compilers with branch prediction are good enough that this doesn't have an impact. Regardless, it's a big set of changes. Is that really what we want?

If we want to be IEEE compliant then we should drop --fast-math, but it's a big decision, likely a community one. Otherwise my vote is to keep the current policy: buyer beware if you generate NaNs. cctbx is specific for a scientific field (e.g. crystallography, cryo-EM, and modeling macromolecules), which generally means that the numbers we work with are finite. We're not like numpy, which is for general math. Usually, any algorithm that regularly generates NaN or infinity values has some issue that should be fixed at the algorithm level.

Thoughts or comments? Seems like a useful discussion to have.

Anthchirp · 2020-05-26T22:52:21Z

I just verified this using an installation of DIALS 1.1 we still keep around (March 2016, and thus well before any NaN discussions from #324). cctbx did evidently not make a guarantee that NaNs were always avoided. It just happened to crash some of the time.

$ dials.python
Python 2.7.8 (default_cci, Apr  8 2016, 15:33:21) 
[GCC 4.4.7 20120313 (Red Hat 4.4.7-16)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from scitbx.array_family import flex
>>> import math
>>> x = flex.double(range(-10, 100)) + 0.5
>>> l = flex.log10(x)
>>> list(l)
[nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, -0.3010299956639812, 0.17609125905568124, 0.3979400086720376, 0.5440680443502757, 0.6532125137753437, 0.7403626894942439, 0.8129133566428556, 0.8750612633917001, 0.9294189257142927, 0.9777236052888477, 1.021189299069938, 1.0606978403536118, 1.0969100130080565, 1.130333768495006, 1.1613680022349748, 1.1903316981702914, 1.2174839442139063, 1.2430380486862944, 1.2671717284030137, 1.290034611362518, 1.3117538610557542, 1.3324384599156054, 1.3521825181113625, 1.3710678622717363, 1.3891660843645326, 1.4065401804339552, 1.423245873936808, 1.4393326938302626, 1.4548448600085102, 1.469822015978163, 1.4842998393467859, 1.4983105537896004, 1.5118833609788744, 1.5250448070368452, 1.5378190950732742, 1.550228353055094, 1.5622928644564746, 1.5740312677277188, 1.5854607295085006, 1.5965970956264601, 1.6074550232146685, 1.6180480967120927, 1.6283889300503116, 1.6384892569546374, 1.6483600109809315, 1.6580113966571124, 1.667452952889954, 1.6766936096248666, 1.6857417386022637, 1.6946051989335686, 1.7032913781186614, 1.711807229041191, 1.7201593034059568, 1.7283537820212285, 1.7363965022766423, 1.7442929831226763, 1.7520484478194385, 1.7596678446896306, 1.7671558660821804, 1.7745169657285496, 1.7817553746524688, 1.7888751157754168, 1.7958800173440752, 1.8027737252919755, 1.8095597146352678, 1.816241299991783, 1.8228216453031045, 1.8293037728310249, 1.8356905714924256, 1.841984804590114, 1.8481891169913987, 1.8543060418010806, 1.8603380065709938, 1.866287339084195, 1.8721562727482928, 1.8779469516291882, 1.8836614351536176, 1.8893017025063104, 1.8948696567452525, 1.9003671286564703, 1.9057958803678685, 1.9111576087399766, 1.916453948549925, 1.921686475483602, 1.9268567089496924, 1.9319661147281726, 1.9370161074648142, 1.9420080530223132, 1.9469432706978254, 1.951823035315912, 1.9566485792052033, 1.9614210940664483, 1.9661417327390325, 1.9708116108725178, 1.975431808509263, 1.9800033715837464, 1.9845273133437926, 1.9890046156985368, 1.9934362304976116, 1.9978230807457253]
>>> print(flex.min(l), flex.max(l))
(-0.3010299956639812, 1.9978230807457253)
>>> import dials.util.version
>>> dials.util.version.dials_version()
'DIALS 1.1.3-ga7ac5d8-release'

I do note however that in 2016 on RHEL7 it returned the correct values

Anthchirp · 2020-05-26T22:58:44Z

same results for all versions since (cs03r-sc-serv-16, RHEL 7.8 server)

$ for x in 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 1.10 1.11 1.12 1.14 2.0 2.1 2.2; do module load dials/$x; dials.python test.py; done
DIALS 1.2.6-gc8a142d-release
(-0.3010299956639812, 1.9978230807457253)
DIALS 1.3.4-g9f5e5cd-release
(-0.3010299956639812, 1.9978230807457253)
DIALS 1.4.5-g37e56cb3-release
(-0.3010299956639812, 1.9978230807457253)
DIALS 1.5.1-gf72becc3-release
(-0.3010299956639812, 1.9978230807457253)
DIALS 1.6.4-g47a579eb-release
(-0.3010299956639812, 1.9978230807457253)
DIALS 1.7.4-gbe933d59-release
(-0.3010299956639812, 1.9978230807457253)
DIALS 1.8.6-gc63d8697-release
(-0.3010299956639812, 1.9978230807457253)
DIALS 1.9.3-gb491019a-release
(-0.3010299956639812, 1.9978230807457253)
DIALS 1.10.4-gc14d9720b-release
(-0.3010299956639812, 1.9978230807457253)
DIALS 1.11.4-g9c616a499-release
(-0.3010299956639812, 1.9978230807457253)
DIALS 1.12.6-gdcf0b845a-release
(-0.3010299956639812, 1.9978230807457253)
DIALS 1.14.13-g10ecfbb15-release
(-0.3010299956639812, 1.9978230807457253)
DIALS 2.0.4-g0252e41c3-release
(-0.3010299956639812, 1.9978230807457253)
DIALS 2.1.5-g118c27f8c-release
(-0.3010299956639812, 1.9978230807457253)
DIALS 2.2.4-g04de204b4-release
-0.3010299956639812 1.9978230807457253

phyy-nx · 2020-05-26T23:59:27Z

@Anthchirp Regarding crashes, the #324 traps aren't present in the release builds so they won't trigger. Regardless, good to see the consistent results from the linux builds. I'd be curious if the mac builds suddenly change from -0.3 to NaN when #463 was merged.

graeme-winter · 2020-05-27T04:59:49Z

Thoughts or comments? Seems like a useful discussion to have.

Agree with this, certainly. Personally I am very much in favour of having code which behaves like IEEE says it should - robust and working trumps --ffast-math in my book.

graeme-winter changed the title ~~Treatment of NaN in flex histogram~~ Treatment of NaN in flex May 18, 2020

graeme-winter added a commit that referenced this issue May 18, 2020

Fix treatment of NaN in histogram

ceb5001

Includes test to assert this behaviour. Means histogram behaves as expected for #478 however issues with min() and max() remain

graeme-winter mentioned this issue May 18, 2020

Fix treatment of NaN in histogram #479

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Treatment of NaN in flex #478

Treatment of NaN in flex #478

graeme-winter commented May 18, 2020

graeme-winter commented May 18, 2020

graeme-winter commented May 18, 2020

graeme-winter commented May 18, 2020

phyy-nx commented May 26, 2020

Anthchirp commented May 26, 2020

Anthchirp commented May 26, 2020

phyy-nx commented May 26, 2020

graeme-winter commented May 27, 2020

Treatment of NaN in flex #478

Treatment of NaN in flex #478

Comments

graeme-winter commented May 18, 2020

graeme-winter commented May 18, 2020

graeme-winter commented May 18, 2020

graeme-winter commented May 18, 2020

phyy-nx commented May 26, 2020

Anthchirp commented May 26, 2020

Anthchirp commented May 26, 2020

phyy-nx commented May 26, 2020

graeme-winter commented May 27, 2020