
Issue with Artifact Detection in R-Peak Detection using NeuroKit2 #1064

Open · sreeshmaraj opened this issue Jan 23, 2025 · 8 comments

@sreeshmaraj

Hi everyone,

I am using NeuroKit2 to process raw ECG signals and calculate HRV metrics. I have a dataset of over 1000 signals. Below is a sample of the code I am using for the analysis:

import os
import pandas as pd
import neurokit2 as nk

# ... inside a loop over ECG files and signal bins (file_path, file_name, bin_name,
# idx and the output folders are defined earlier in the script) ...

        try:
            data = pd.read_csv(file_path)
            ecg_cleaned, _ = nk.ecg_process(data, sampling_rate=500, method='Neurokit')
            ecg_cleaned_signal = ecg_cleaned['ECG_Clean']
            rpeaks = nk.ecg_findpeaks(ecg_cleaned_signal, sampling_rate=500, method='neurokit')
            fixed_rpeaks = nk.signal_fixpeaks(rpeaks, sampling_rate=500, method='kubios')

            # Check the structure of fixed_rpeaks
            if isinstance(fixed_rpeaks, dict):
                rpeaks_indices = fixed_rpeaks.get('ECG_R_Peaks', [])
            elif isinstance(fixed_rpeaks, tuple):
                rpeaks_indices = fixed_rpeaks[1]  # method='kubios' returns (artifacts_info, corrected_peaks)
            else:
                raise ValueError(f"Unexpected structure for fixed_rpeaks: {type(fixed_rpeaks)}")

            # Calculate time-domain HRV metrics
            hrv_time = nk.hrv_time(rpeaks_indices, sampling_rate=500)
            time_output_path = os.path.join(time_folder, f"{bin_name}_time_metrics.txt")
            hrv_time.to_csv(time_output_path, sep='\t', index_label='Metric')
            print(f"Time metrics for bin {idx + 1} saved to {time_output_path}")

            # Calculate frequency-domain HRV metrics
            hrv_frequency = nk.hrv_frequency(rpeaks_indices, sampling_rate=500, normalize=True, interpolation_rate=4, psd_method='fft', show=False)
            frequency_output_path = os.path.join(frequency_folder, f"{bin_name}_frequency_metrics.txt")
            hrv_frequency.to_csv(frequency_output_path, sep='\t', index_label='Metric')
            print(f"Frequency metrics for bin {idx + 1} saved to {frequency_output_path}")

            # Calculate nonlinear HRV metrics with additional error handling
            try:
                hrv_nonlinear = nk.hrv_nonlinear(rpeaks_indices, sampling_rate=500)
                nonlinear_output_path = os.path.join(nonlinear_folder, f"{bin_name}_nonlinear_metrics.txt")
                hrv_nonlinear.to_csv(nonlinear_output_path, sep='\t', index_label='Metric')
                print(f"Nonlinear metrics for bin {idx + 1} saved to {nonlinear_output_path}")
            except Exception as e:
                print(f"Error calculating nonlinear metrics for bin {idx + 1} of {file_name}: {e}")
                continue

        except Exception as e:
            print(f"Error processing bin {idx + 1} of {file_name}: {e}")
            continue

While visualizing the ECG signals and detected R-peaks, I noticed that in some signals high-amplitude artifacts are being incorrectly identified as R-peaks. These artifacts could negatively impact the HRV calculations.

Here is an example visualization where the artifact is being detected as an R-peak:

[Image: ECG segment where a high-amplitude artifact is detected as an R-peak]

Is there any way to clean or filter these artifacts more effectively during R-peak detection in NeuroKit2? For example:

- Can I apply additional preprocessing steps to filter out high-amplitude artifacts?
- Are there parameters or alternative methods in nk.ecg_findpeaks or nk.ecg_process to improve artifact rejection?
I want to ensure the detected R-peaks are accurate before calculating HRV metrics.

Any guidance, suggestions, or best practices would be greatly appreciated!

Thank you in advance! 😊


@DerAndereJohannes (Collaborator)

Hi! It looks like your signal has some very short bursts of salt-and-pepper noise. ecg_process only applies a high-pass filter to remove the drift; aside from powerline-noise removal, it does nothing to the higher frequencies, since they are needed for the signal itself. Frequency-domain filtering could be an option; however, it may eat into the data that you actually want to keep.
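
As a rough sketch of that frequency-domain option (assuming a 40 Hz Butterworth low-pass on the simulated signal from the example below; the cutoff and order are illustrative choices, not recommendations), it could look like this:

import neurokit2 as nk

# Simulated ECG with a brief high-amplitude spike, as in the example further down
ecg = nk.ecg_simulate(duration=15, heart_rate=60, sampling_rate=500)
ecg[6000:6010] = 8

# Low-pass the raw signal before ecg_process. Note that a filter like this tends
# to smear rather than remove a short spike, and an aggressive cutoff can also
# blunt the QRS complexes you actually want to keep.
ecg_filtered = nk.signal_filter(ecg, sampling_rate=500, highcut=40,
                                method="butterworth", order=5)

signals, info = nk.ecg_process(ecg_filtered, sampling_rate=500)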

I can't tell from the image itself, but if the noise is very short (and especially narrower than your R-peaks in terms of samples), then you could try a median filter before applying the ECG processing. Here is an example:

[Image: cleaned ECG signal without vs. with median smoothing]

# Imports
import neurokit2 as nk
import matplotlib.pyplot as plt

# Simulate ECG
ecg = nk.ecg_simulate(duration=15, heart_rate=60, sampling_rate=500)

# Simulate salt/pepper peaks
ecg[6000:6010] = 8
ecg[3200:3210] = 8

# Apply median filtering
ecg2 = nk.signal_smooth(ecg, kernel='median', size=25)

# ECG Process
signals, info = nk.ecg_process(ecg, sampling_rate=500)
signals2, info2 = nk.ecg_process(ecg2, sampling_rate=500)

# Visualize
fig, axs = plt.subplots(1, 2)
axs[0].set_title("Cleaned ECG Signal WITHOUT median smoothing")
axs[0].plot(signals['ECG_Clean'])
axs[0].set_ylim(-2, 8)
axs[1].set_title("Cleaned ECG Signal WITH median smoothing")
axs[1].plot(signals2['ECG_Clean'])
axs[1].set_ylim(-2, 8)
plt.tight_layout()
plt.show()

Let me know if this helped.

@sreeshmaraj (Author) commented Jan 23, 2025 via email

@DerAndereJohannes (Collaborator)

Hi sreeshmaraj,

Unfortunate to hear that the signal_smooth function does not seem to work properly. Is this from a data or a function standpoint? That is, does the smoothing remove too much of your actual data, or does the resulting signal still contain the noise?

I'll be honest and say that I have not done many HRV-based analyses before. What I do know is that the most important thing for HRV is that the R-peaks are correctly classified, so if non-cardiac peaks are being detected as R-peaks, then this is indeed a problem.

I personally do not see how standardizing the ECG signal can be beneficial, as it will just rescale the signal in a different way. The spurious peaks will still remain.
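
As a minimal illustration of that point (reusing the simulated signal and the artificial spike from the example above, and assuming nk.standardize is the kind of standardization you have in mind), z-scoring rescales everything uniformly, so the spike stays the most prominent excursion:

import numpy as np
import neurokit2 as nk

# Simulated ECG with one artificial spike, as in the median-filter example above
ecg = nk.ecg_simulate(duration=15, heart_rate=60, sampling_rate=500)
ecg[3200:3210] = 8

# Standardization (z-scoring) rescales the whole signal with a single mean and
# standard deviation, so relative amplitudes are preserved.
ecg_std = np.asarray(nk.standardize(ecg))

# The spike is still the largest excursion after standardization
print(int(np.argmax(np.abs(ecg))) == int(np.argmax(np.abs(ecg_std))))  # True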

@sreeshmaraj (Author) commented Jan 23, 2025 via email

@DerAndereJohannes (Collaborator)

Ah okay, there is not much I can do without seeing your data itself. However, do try playing around with the size argument like I showed here: ecg2 = nk.signal_smooth(ecg, kernel='median', size=25). Increase size until the spikes go away, and check whether your actual ECG signal still looks the same in terms of shape.

Later, you can check whether the spikes are of similar length across recordings and settle on a single size value for the entire dataset.
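
As a rough sketch of that tuning step (on the simulated signal from the earlier example, standing in for your data; the candidate sizes are arbitrary), you could compare a few kernel sizes side by side:

import neurokit2 as nk
import matplotlib.pyplot as plt

# Simulated ECG with a short spike, as in the earlier example
ecg = nk.ecg_simulate(duration=15, heart_rate=60, sampling_rate=500)
ecg[6000:6010] = 8

# Smooth with several candidate kernel sizes and inspect the results visually:
# pick the smallest size that removes the spikes without deforming the QRS shape.
sizes = [15, 25, 51, 101]
fig, axs = plt.subplots(len(sizes), 1, figsize=(10, 8), sharex=True)
for ax, size in zip(axs, sizes):
    smoothed = nk.signal_smooth(ecg, kernel="median", size=size)
    ax.plot(smoothed)
    ax.set_title(f"Median filter, size={size}")
plt.tight_layout()
plt.show()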

@sreeshmaraj (Author) commented Jan 24, 2025 via email

@DerAndereJohannes (Collaborator)

Using a median filter is a very common non-linear technique to smooth the signal, and it is a valid pre-processing step. Just be as transparent as possible about your technique and make as few changes as possible, so you do not deviate too much from the original signal.

Your noise might take up more samples than I assumed, so you may need to be more targeted with your corrections. The good thing is that your noise is obvious in amplitude, so we can use this information to get better results.

Instead of a median filter, you could perhaps look at morphological filters and focus them on the sections that have high amplitudes. Here is an example that masks the outliers and then applies erosion and dilation:

[Image: original signal with salt-and-pepper noise, signal after erosion, and signal after erosion and dilation, with detected R-peaks marked]

import numpy as np
import neurokit2 as nk
import matplotlib.pyplot as plt
from scipy.ndimage import grey_erosion, grey_dilation

# Simulate ECG
ecg = nk.ecg_simulate(duration=15, heart_rate=60, sampling_rate=500)

# Simulate salt/pepper peaks
ecg[6000:6010] = 8
ecg[3200:3210] = 8
ecg[1100:1110] = -8

# Constants
size = 12
threshold = 4
padding = 5

# Create a mask
mask = np.abs(ecg) > threshold
indices = np.where(mask)[0]
chunks = np.split(indices, np.where(np.diff(indices) != 1)[0] + 1)

eroded_signal = ecg.copy()
dilated_signal = ecg.copy()

# Apply morphological filters
for chunk in chunks:
    bot = np.maximum(np.min(chunk) - padding, 0)
    top = np.minimum(np.max(chunk) + padding, len(mask))

    eroded_signal[bot+padding:top-padding+1] = grey_erosion(ecg[bot:top], size=size)[padding:-padding+1]
    dilated_signal[bot+padding:top-padding+1] = grey_dilation(eroded_signal[bot:top], size=size)[padding:-padding+1]

signals_full, infos_f = nk.ecg_process(dilated_signal, sampling_rate=500)
signals_ero, infos_e = nk.ecg_process(eroded_signal, sampling_rate=500)
signals_unchanged, infos_u = nk.ecg_process(ecg, sampling_rate=500)

# Visualize
fig, axs = plt.subplots(3, 1, figsize=(14, 7))
axs[0].plot(ecg, label='Original Signal with Noise')
axs[0].scatter(infos_u['ECG_R_Peaks'], [ecg[x] for x in infos_u['ECG_R_Peaks']], c='r')
axs[0].set_title("Original Signal with Salt and Pepper Noise")

axs[1].plot(eroded_signal, label='Eroded Signal', color='orange')
axs[1].scatter(infos_e['ECG_R_Peaks'], [eroded_signal[x] for x in infos_e['ECG_R_Peaks']], c='r')
axs[1].set_title("Signal after Erosion")

axs[2].plot(dilated_signal, label='Dilated Signal', color='green')
axs[2].scatter(infos_f['ECG_R_Peaks'], [dilated_signal[x] for x in infos_f['ECG_R_Peaks']], c='r')
axs[2].set_title("Signal after Erosion AND Dilation")

plt.tight_layout()
plt.show()

Make sure to also adjust size and threshold to better match your data. Also note that the corrected noise sections are not perfect (do zoom in), but the result should be good enough to get the R-peaks right.
