Back to the main page.

Bug 3220 - ft_componentanalysis doesn't accept NaNs in the data

Status CLOSED FIXED
Reported 2016-12-14 14:16:00 +0100
Modified 2017-11-20 20:13:44 +0100
Product: FieldTrip
Component: core
Version: unspecified
Hardware: PC
Operating System: Linux
Importance: P5 normal
Assigned to: Jan-Mathijs Schoffelen
URL:
Tags:
Depends on: 264532183219
Blocks:
See also: http://bugzilla.fieldtriptoolbox.org/show_bug.cgi?id=3238

Pim Mostert - 2016-12-14 14:16:04 +0100

ft_componentanalysis throws the following error when the data contains NaNs: Error using schur Input to SCHUR must not contain NaN or Inf. I realize that it's not trivial that an ICA algorithm should be able to handle NaNs. However, I consider it plausible that an ICA is run after artifact rejection. Given that I reject artifacts by filling them with NaNs (ft_rejectartifact with cfg.artfctdef.reject = 'nan'), I would expect ft_componentanalysis to be able to handle that. I uploaded an example data set and script to the FTP dropbox: - tmp_script_4.m - tmp_data_small.mat This script contains comments to demonstrate the problem. Thanks!


Jan-Mathijs Schoffelen - 2016-12-14 14:32:01 +0100

why do you expect ft_componentanalysis to be nan-aware? what about ft_freqanalysis? more in general, the series of bugs you recently opened pertain to nan-awareness in general. it may be worthwile to make an inventory of where nans may cause unexpected behaviour, e.g. in ft_compoenentanalysis/ft-freqanalysis/functions of the ft_preproc module etc and where it can be addressed relatively straightforwardly e.g. by discarding the nans before the computation, and adding them back afterwards (as in ft_componentanalysis) Note that there used to be a bug that was related to inventorizing the nan-awareness of the functions in the preproc module. I cannot easily find it right now.


Jan-Mathijs Schoffelen - 2016-12-15 09:35:46 +0100

Hi Pim (this one is a question to you :o) ), Do you need the NaN-trials throughout your analysis pipeline? In other words, if ft_componentanalysis would be made nan-aware by explicitly throwing out the NaNs before estimating the mixing model, but not putting them back, would that be OK behavior for you?


Pim Mostert - 2016-12-16 16:11:54 +0100

Thanks for the fixes to the other two bugs! No, that would not be acceptable behavior for me. The reason I want to keep them, is that I ultimately want to work with the trials in chan*time*trial matrix, which implies that all trials need to be of equal length. I'll work around this issue manually for now. However, for ft_componentanalysis the solution is relative simple I think. For estimation of the weights you could simply a priori throw out the timepoints with NaNs. For the subsequent transformation of the data using these weights, a simple matrix multiplication would give the desired result, as the NaNs in the original data should result in NaNs in the transformed data, right? I encountered another problem using NaNs (would you like me to open a new bug?): when using ft_databrowser after (whole) trials have been filled with NaNs, the y-axes are set to [NaN NaN]. This is probably specifically the case due to the fact that my very first trial is entirely filled with NaNs.


Robert Oostenveld - 2016-12-19 11:23:14 +0100

(In reply to Pim Mostert from comment #3) Your solution for ICA (estimate it on the non-nan sections, apply it to the whole) is fine. Please open a separate bug for data browser.


Jan-Mathijs Schoffelen - 2016-12-19 11:28:49 +0100

Uhm, I don't think that this particular bug has been fixed. I have addressed 3218 and 3219


Robert Oostenveld - 2016-12-19 11:50:06 +0100

(In reply to Jan-Mathijs Schoffelen from comment #5) Is your idea that this is implemented in the FT code, rather than the users' script (which I assumed)?


Jan-Mathijs Schoffelen - 2016-12-19 11:56:12 +0100

Well, I think that Pim's expectation is that it is going to be built into the code, I would be fine with the users having to deal with this themselves, in which case the correct resolved status of this bug will be: 'WONTFIX' :o).


Jan-Mathijs Schoffelen - 2016-12-19 13:39:37 +0100

another random thought: why isn't ft_timelockanalysis 'gap-aware'? In other words, cfg.keeptrials gives an Nrpt x Nchan x Ntime matrix with nans for missing values. However, if at an earlier stage (ft_rejecttrial + cfg.reject = 'partial') trials have been chopped up, these subsegments end up in different slices of tlck.trial.


Pim Mostert - 2016-12-19 15:05:57 +0100

(In reply to Jan-Mathijs Schoffelen from comment #7) I'd personally say that the FT code should handle it. Doing it manually isn't all that hard, but it is pretty ugly. Moreover, I think it's fair that a user may expect these functions to handle NaNs properly. Currently, they don't, and neither do they very clearly tell you. So I'd say, either add it to the code; or (for now) make the function throw an explicit error to inform the user that he/she should do it him-/herself. Additionally, ft_rejectcomponent doesn't handle NaNs either. In this case I do get an explicit error :-), though I don't understand it: " Error using ft_apply_montage (line 262) Your input data contains NaNs in channels that are unused in the supplied montage. This would result in undesired NaNs in the output data. Please remove these channels from the input data (using ft_selectdata) before attempting to apply the montage. Error in ft_rejectcomponent (line 184) data = ft_apply_montage(data, montage, 'keepunused', keepunused, 'feedback', cfg.feedback); " I'll do the rejection manually for now.


Jan-Mathijs Schoffelen - 2016-12-23 09:01:26 +0100

Assigned to a named person, to keep it on the radar. ;o)


Robert Oostenveld - 2016-12-23 11:50:30 +0100

(In reply to Pim Mostert from comment #9) agreed that FT should do it. Having NaNs in the data allows for smart stuff, but right now FT code is not so smart yet. It would help to classify the functions according to input-output-algorithm, since I expect that there are only a limited number of patterns, which makes it possible to implement and reuse the solutions.


Jan-Mathijs Schoffelen - 2017-09-22 13:50:44 +0200

I have just implemented nan-transparency for ft_componentanalysis (throwing away the columns that have nans in the concatenated data matrix).


Jan-Mathijs Schoffelen - 2017-11-09 17:11:23 +0100

I believe that this has been addressed.