Fix: Potential Data Leakage in Quantum Data Tutorial. #829

Open
OkuyanBoga wants to merge 1 commit into tensorflow:master from OkuyanBoga:fix-data-leakage-in-tutorial

@OkuyanBoga

This PR proposes a fix for the potential data leakage reported in #828.

Instead of concatenating the train and test sets, compute the stilted dataset for each set separately.

In lines L745-752:

y_train_new = get_stilted_dataset(S_pqk, V_pqk, S_original, V_original)
y_test_new = get_stilted_dataset(S_pqk_test, V_pqk_test, S_test_original, V_test_original)

where the spectrum is calculated separately for the test set:

S_pqk_test, V_pqk_test = get_spectrum(
    tf.reshape(x_test_pqk, [-1, len(qubits) * 3]))

S_test_original, V_test_original = get_spectrum(
    tf.cast(x_test, tf.float32), gamma=0.005)

print('Eigenvectors of pqk kernel matrix for test:', V_pqk_test)
print('Eigenvectors of original kernel matrix for test:', V_test_original)
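
To illustrate why the split matters, here is a minimal NumPy sketch; it is not the tutorial's code. `rbf_spectrum` is a hypothetical stand-in for `get_spectrum` (it assumes a plain RBF kernel), and the data are random placeholders. It shows that a spectrum computed over the concatenated data changes when the test points change, while a train-only spectrum does not.

```python
import numpy as np


def rbf_spectrum(points, gamma=1.0):
    """Hypothetical stand-in for get_spectrum: eigendecompose an RBF kernel matrix."""
    # Pairwise squared Euclidean distances between all rows of `points`.
    sq_dists = np.sum((points[:, None, :] - points[None, :, :]) ** 2, axis=-1)
    kernel = np.exp(-gamma * sq_dists)
    eigvals, eigvecs = np.linalg.eigh(kernel)
    return np.abs(eigvals), eigvecs


rng = np.random.default_rng(0)
x_train = rng.normal(size=(8, 3))   # placeholder training features
x_test = rng.normal(size=(4, 3))    # placeholder test features

# Joint variant: one spectrum over train and test together.
S_all, V_all = rbf_spectrum(np.concatenate([x_train, x_test]))

# Separate variant: each split gets its own spectrum.
S_train, V_train = rbf_spectrum(x_train)
S_test, V_test = rbf_spectrum(x_test)

# Swap in different test points and recompute both variants.
x_test_other = rng.normal(size=(4, 3))
S_all_other, _ = rbf_spectrum(np.concatenate([x_train, x_test_other]))
S_train_again, _ = rbf_spectrum(x_train)

# The joint spectrum depends on the test points; the train-only one does not.
joint_depends_on_test = not np.allclose(S_all, S_all_other)
train_independent_of_test = np.allclose(S_train, S_train_again)
print(joint_depends_on_test, train_independent_of_test)
```

Any relabeling derived from the joint spectrum therefore carries information about the test points into the training labels, which is the concern raised in #828.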

Closes #828.

@mhucka
Member

mhucka commented Feb 25, 2026

@OkuyanBoga Thank you for this contribution. Would you be able to resolve the (simple) conflicts that have arisen? I will then review the PR.

mhucka self-assigned this Feb 25, 2026
mhucka added the area/docs (Involves documentation – problems, ideas, requests) label Feb 25, 2026

Development

Successfully merging this pull request may close these issues.

Possible data leakage in quantum/docs/tutorials/quantum_data.ipynb

2 participants