Skip to content

Add CalCombinedEnergyCalib + CalEnergyCalibInfo CID link; update CalSourceEnergyCalib (ndf + CSV fix)#1718

Open
zwl0331 wants to merge 19 commits intoMu2e:mainfrom
zwl0331:main
Open

Add CalCombinedEnergyCalib + CalEnergyCalibInfo CID link; update CalSourceEnergyCalib (ndf + CSV fix)#1718
zwl0331 wants to merge 19 commits intoMu2e:mainfrom
zwl0331:main

Conversation

@zwl0331
Copy link

@zwl0331 zwl0331 commented Feb 11, 2026

Summary

This PR updates calorimeter energy calibration table code and naming, while keeping database DDL changes out of the PR.

Changes

  1. Extended CalSourceEnergyCalib with ndf (row definition, query/schema string, parsing, and CSV serialization).
  2. Fixed a pre-existing CalSourceEnergyCalib::rowToCsv omission: frFull, frFirst, and frSecond are now serialized.
  3. Added archive table CalCombinedEnergyCalib (cal.combinedenergycalib) for per-channel combined ADC2MeV constants, uncertainty, status code, and status message.
  4. Added ad-hoc table CalEnergyCalibInfo (cal.energycalibinfo) to link CalEnergyCalib CID to CalCombinedEnergyCalib CID.
  5. Moved combined-calibration status codes to shared include CaloConditions/inc/CalCombinedEnergyCalibStatus.hh.
  6. Replaced by-value + std::move row constructors with const std::string& in the updated calibration info table rows.
  7. Simplified CSV handling per review guidance: direct column reads and quoted string output in rowToCsv.
  8. Removed DbService/data/create.sql changes from this PR.

Behavior Notes

  • CalSourceEnergyCalib CSV output now includes frFull, frFirst, frSecond, and appended ndf.

@FNALbuild
Copy link
Collaborator

Hi @zwl0331,
You have proposed changes to files in these packages:

  • DbTables

which require these tests: build.

@Mu2e/write, @Mu2e/fnalbuild-users have access to CI actions on main.

⌛ The following tests have been triggered for 31af759: build (Build queue - API unavailable)

About FNALbuild. Code review on Mu2e/Offline.

@FNALbuild
Copy link
Collaborator

☀️ The build tests passed at 31af759.

Test Result Details
test with Command did not list any other PRs to include
merge Merged 31af759 at a500268
build (prof) Log file. Build time: 04 min 09 sec
ceSimReco Log file.
g4test_03MT Log file.
transportOnly Log file.
POT Log file.
g4study Log file.
cosmicSimReco Log file.
cosmicOffSpill Log file.
ceSteps Log file.
ceDigi Log file.
muDauSteps Log file.
ceMix Log file.
rootOverlaps Log file.
g4surfaceCheck Log file.
trigger Log file.
check_cmake Log file.
FIXME, TODO TODO (0) FIXME (0) in 3 files
clang-tidy 0 errors 0 warnings
whitespace check no whitespace errors found

N.B. These results were obtained from a build of this Pull Request at 31af759 after being merged into the base branch at a500268.

For more information, please check the job page here.
Build artifacts are deleted after 5 days. If this is not desired, select Keep this build forever on the job page.

@rlcee
Copy link
Collaborator

rlcee commented Feb 11, 2026

Can you describe a bit more what purpose of CalEnergyCalibInfo? The "info" in the name implies it is an "adhoc" table, but it doesn't have the adhoc flag. (example of adhoc is CalCosmicEnergyCalibInfo) It is structured and seems to have the purpose of an archive table, like CalSourceEnergyCalib, and so if the intent is to record, channel by channel, the result of combining the calib sources, I'd expect it to be named more like CalCombinedEnergyCalib to keep in more in parallel with the other archive tables. Thanks

@rlcee rlcee self-requested a review February 11, 2026 14:53
@oksuzian oksuzian requested a review from giro94 February 11, 2026 16:04
@giro94
Copy link
Collaborator

giro94 commented Feb 11, 2026

I second Ray's comments. You could call this CalCombinedEnergyCalib and will be essentially identical to the production CalEnergyCalib but with additional information for each row. And, being an archive table, no production maintenance on it. Then, you could think of creating a CalCombinedEnergyCalibInfo adhoc table storing the CIDs of the cosmics and source tables used for the combination (and maybe the CID of the production table?).

@oksuzian
Copy link
Collaborator

I'll review this pull request by first loading the PR details and its changes.Let me get the actual file changes in this PR.Now let me look at the existing similar table (CalCosmicEnergyCalibInfo) for comparison, and the original CalSourceEnergyCalib to understand the full context:Now I have a thorough understanding of the PR. Let me provide my review.


PR Review: #1718

This PR makes two changes: (1) adds an ndf field to CalSourceEnergyCalib, and (2) adds a new CalEnergyCalibInfo table. I found several issues ranging from a bug to style/design concerns.


🐛 Bug: Pre-existing rowToCsv bug fixed, but fix is incomplete

The original CalSourceEnergyCalib::rowToCsv on main was already buggy — it was missing the frFull, frFirst, and frSecond fields. It jumped from secescErrWidth directly to chisq:

    void rowToCsv(std::ostringstream& sstream, std::size_t irow) const override {
      Row const& r = _rows.at(irow);
      sstream << std::fixed << std::setprecision(5);
      sstream << r.roid()<<",";
      sstream << r.fullEPeak()<<",";
      ...
      sstream << r.secescErrWidth()<<",";
      sstream << r.chisq();  // <-- frFull, frFirst, frSecond are missing!
    }

This PR fixes that pre-existing bug by adding the missing fields. This is good, but it should be explicitly mentioned in the PR description as a bug fix, since it changes the CSV output format independently of the ndf addition. Consumers of this CSV data will see a format change.


⚠️ Issue 1: status_message with commas will break CSV parsing

In CalEnergyCalibInfo, the status_message field is a free-form std::string written directly to CSV without any quoting/escaping:

    sstream << r.status_code() << ",";
    sstream << r.status_message();

And in addRow, it's read as simply columns[4]:

    _rows.emplace_back(CaloSiPMId(index), std::stof(columns[1]),
                       std::stof(columns[2]), std::stoi(columns[3]),
                       columns[4]);

If status_message ever contains a comma, CSV parsing will break — the message will be split across multiple column indices. Since this is the last column it partially mitigates the issue (the base DbTable parser would need to handle it), but it's still fragile. Consider:

  • Documenting that commas are forbidden in status_message, or
  • Using a fixed enum/int instead of a free-form string, or
  • Ensuring the DbTable CSV parser treats the last column as "rest of line"

⚠️ Issue 2: Narrowing conversion in addRowstd::stoul to std::uint16_t

    std::uint16_t index = std::stoul(columns[0]);

std::stoul returns unsigned long, which is silently narrowed to uint16_t. If the parsed value exceeds 65535, it will silently wrap. This is the same pattern used in the existing CalSourceEnergyCalib, so it's consistent with the codebase, but worth noting. The subsequent bounds check against CaloConst::_nChannelDB does provide a safety net.


⚠️ Issue 3: std::string status_message passed by value in constructor

    Row(CaloSiPMId roid, float ADC2MeV, float ADC2MeV_err,
        int status_code, std::string status_message)
        : _roid(roid),
          _ADC2MeV(ADC2MeV),
          _ADC2MeV_err(ADC2MeV_err),
          _status_code(status_code),
          _status_message(status_message) {}

The status_message parameter is taken by value but then copied into _status_message. It should either be taken by const std::string& (to avoid one extra copy) or moved via std::move(status_message) in the initializer list. For example:

_status_message(std::move(status_message)) {}

This also applies to the existing CalCosmicEnergyCalibInfo (same pattern), but for new code it's worth getting right.


💡 Issue 4: std::setprecision(5) applied to integer fields

In CalEnergyCalibInfo::rowToCsv:

    sstream << std::fixed << std::setprecision(5);
    sstream << r.roid() << ",";
    sstream << r.ADC2MeV() << ",";
    sstream << r.ADC2MeV_err() << ",";
    sstream << r.status_code() << ",";
    sstream << r.status_message();

std::fixed with setprecision(5) doesn't affect integer output, but it does affect roid() if CaloSiPMId's operator<< outputs a float-like value. This is consistent with the codebase pattern, so it's minor.


💡 Issue 5: Inconsistent coding style between the new file and existing codebase

The new CalEnergyCalibInfo.hh uses a different formatting style (indentation, brace placement, spacing) compared to the existing tables like CalSourceEnergyCalib.hh and CalCosmicEnergyCalibInfo.hh. For example:

  • New file uses 2-space indent with braces on same line (formatted style)
  • Existing files use mixed indentation with more compact formatting

This is a minor consistency point but worth aligning for maintainability.


💡 Issue 6: Missing #include <vector> in CalEnergyCalibInfo.hh

The new file includes <vector> explicitly, which is good. However, it does not include "Offline/DataProducts/inc/CaloConst.hh" — wait, it does include it. And it doesn't include <map> like CalSourceEnergyCalib.hh does (though <map> isn't actually used there either). This is fine.


Summary

Severity Issue Location
🐛 Bug (pre-existing, fixed here) rowToCsv was missing frFull, frFirst, frSecond fields — this is a behavior change that should be documented CalSourceEnergyCalib.hh
⚠️ Medium status_message string in CSV can break parsing if it contains commas CalEnergyCalibInfo.hh
⚠️ Low std::string passed by value instead of by const-ref or moved CalEnergyCalibInfo.hh Row constructor
💡 Nit Inconsistent code style with existing tables CalEnergyCalibInfo.hh

The most important item to address is Issue 1 (the status_message CSV safety concern). The pre-existing rowToCsv bug fix is a welcome improvement but should be called out in the PR description since it changes behavior beyond just adding ndf.

@FNALbuild
Copy link
Collaborator

📝 The HEAD of main has changed to 10120d9. Tests are now out of date.

zwl0331 and others added 5 commits February 22, 2026 23:30
Quote/escape status_message in rowToCsv and unquote on read in addRow.
Also treat trailing columns as part of the last field so messages with commas
round-trip correctly instead of breaking CSV parsing.
@zwl0331
Copy link
Author

zwl0331 commented Feb 23, 2026

Can you describe a bit more what purpose of CalEnergyCalibInfo? The "info" in the name implies it is an "adhoc" table, but it doesn't have the adhoc flag. (example of adhoc is CalCosmicEnergyCalibInfo) It is structured and seems to have the purpose of an archive table, like CalSourceEnergyCalib, and so if the intent is to record, channel by channel, the result of combining the calib sources, I'd expect it to be named more like CalCombinedEnergyCalib to keep in more in parallel with the other archive tables. Thanks

The purpose of it is to store extra information from the combination algorithm such as the error on the ADCtoMeV constant and the source of this constant (such as "only using cosmic" or "averaging cosmic and source"). I have flagged it as an adhoc table.

I named it a "CalEnergyCalibInfo" table because it is meant to be a table to complement the information in the "CalEnergyCalib" table, just like how "CalCosmicEnergyCalibInfo" is meant to complement "CalCosmicEnergyCalib" table.

@oksuzian
Copy link
Collaborator

@FNALbuild run build test

@FNALbuild
Copy link
Collaborator

⌛ The following tests have been triggered for 71d1484: build (Build queue - API unavailable)

@FNALbuild
Copy link
Collaborator

☀️ The build tests passed at 71d1484.

Test Result Details
test with Command did not list any other PRs to include
merge Merged 71d1484 at 3e5e0e6
build (prof) Log file. Build time: 04 min 11 sec
ceSimReco Log file.
g4test_03MT Log file.
transportOnly Log file.
POT Log file.
g4study Log file.
cosmicSimReco Log file.
cosmicOffSpill Log file.
ceSteps Log file.
ceDigi Log file.
muDauSteps Log file.
ceMix Log file.
rootOverlaps Log file.
g4surfaceCheck Log file.
trigger Log file.
check_cmake Log file.
FIXME, TODO TODO (0) FIXME (0) in 4 files
clang-tidy 0 errors 0 warnings
whitespace check no whitespace errors found

N.B. These results were obtained from a build of this Pull Request at 71d1484 after being merged into the base branch at 3e5e0e6.

For more information, please check the job page here.
Build artifacts are deleted after 5 days. If this is not desired, select Keep this build forever on the job page.

@zwl0331
Copy link
Author

zwl0331 commented Mar 3, 2026

I have flagged it as an adhoc table.

@zwl0331 adhoc tables work a bit differently as each commit is added as a row. We should make this a standard archive table (which has nothing different wrt a production table except for how we commit it to the database). I propose we could add another ad-hoc table to be able to link the CIDs of CalEnergyCalib and CalEnergyCalibInfo. Can we discuss this on Friday?

I removed the ad-hoc treatment for CalEnergyCalibInfo and added a dedicated ad-hoc table CalEnergyCalibInfoLink to link CIDs:

  • energycalibcid
  • energycalibinfocid
  • comment

This keeps the per-channel table as archive data and uses ad-hoc rows only for commit-level provenance.

@zwl0331
Copy link
Author

zwl0331 commented Mar 3, 2026

I'll review this pull request by first loading the PR details and its changes.Let me get the actual file changes in this PR.Now let me look at the existing similar table (CalCosmicEnergyCalibInfo) for comparison, and the original CalSourceEnergyCalib to understand the full context:Now I have a thorough understanding of the PR. Let me provide my review.

PR Review: #1718

This PR makes two changes: (1) adds an ndf field to CalSourceEnergyCalib, and (2) adds a new CalEnergyCalibInfo table. I found several issues ranging from a bug to style/design concerns.

🐛 Bug: Pre-existing rowToCsv bug fixed, but fix is incomplete

The original CalSourceEnergyCalib::rowToCsv on main was already buggy — it was missing the frFull, frFirst, and frSecond fields. It jumped from secescErrWidth directly to chisq:

    void rowToCsv(std::ostringstream& sstream, std::size_t irow) const override {
      Row const& r = _rows.at(irow);
      sstream << std::fixed << std::setprecision(5);
      sstream << r.roid()<<",";
      sstream << r.fullEPeak()<<",";
      ...
      sstream << r.secescErrWidth()<<",";
      sstream << r.chisq();  // <-- frFull, frFirst, frSecond are missing!
    }

This PR fixes that pre-existing bug by adding the missing fields. This is good, but it should be explicitly mentioned in the PR description as a bug fix, since it changes the CSV output format independently of the ndf addition. Consumers of this CSV data will see a format change.

⚠️ Issue 1: status_message with commas will break CSV parsing

In CalEnergyCalibInfo, the status_message field is a free-form std::string written directly to CSV without any quoting/escaping:

    sstream << r.status_code() << ",";
    sstream << r.status_message();

And in addRow, it's read as simply columns[4]:

    _rows.emplace_back(CaloSiPMId(index), std::stof(columns[1]),
                       std::stof(columns[2]), std::stoi(columns[3]),
                       columns[4]);

If status_message ever contains a comma, CSV parsing will break — the message will be split across multiple column indices. Since this is the last column it partially mitigates the issue (the base DbTable parser would need to handle it), but it's still fragile. Consider:

  • Documenting that commas are forbidden in status_message, or
  • Using a fixed enum/int instead of a free-form string, or
  • Ensuring the DbTable CSV parser treats the last column as "rest of line"

⚠️ Issue 2: Narrowing conversion in addRowstd::stoul to std::uint16_t

    std::uint16_t index = std::stoul(columns[0]);

std::stoul returns unsigned long, which is silently narrowed to uint16_t. If the parsed value exceeds 65535, it will silently wrap. This is the same pattern used in the existing CalSourceEnergyCalib, so it's consistent with the codebase, but worth noting. The subsequent bounds check against CaloConst::_nChannelDB does provide a safety net.

⚠️ Issue 3: std::string status_message passed by value in constructor

    Row(CaloSiPMId roid, float ADC2MeV, float ADC2MeV_err,
        int status_code, std::string status_message)
        : _roid(roid),
          _ADC2MeV(ADC2MeV),
          _ADC2MeV_err(ADC2MeV_err),
          _status_code(status_code),
          _status_message(status_message) {}

The status_message parameter is taken by value but then copied into _status_message. It should either be taken by const std::string& (to avoid one extra copy) or moved via std::move(status_message) in the initializer list. For example:

_status_message(std::move(status_message)) {}

This also applies to the existing CalCosmicEnergyCalibInfo (same pattern), but for new code it's worth getting right.

💡 Issue 4: std::setprecision(5) applied to integer fields

In CalEnergyCalibInfo::rowToCsv:

    sstream << std::fixed << std::setprecision(5);
    sstream << r.roid() << ",";
    sstream << r.ADC2MeV() << ",";
    sstream << r.ADC2MeV_err() << ",";
    sstream << r.status_code() << ",";
    sstream << r.status_message();

std::fixed with setprecision(5) doesn't affect integer output, but it does affect roid() if CaloSiPMId's operator<< outputs a float-like value. This is consistent with the codebase pattern, so it's minor.

💡 Issue 5: Inconsistent coding style between the new file and existing codebase

The new CalEnergyCalibInfo.hh uses a different formatting style (indentation, brace placement, spacing) compared to the existing tables like CalSourceEnergyCalib.hh and CalCosmicEnergyCalibInfo.hh. For example:

  • New file uses 2-space indent with braces on same line (formatted style)
  • Existing files use mixed indentation with more compact formatting

This is a minor consistency point but worth aligning for maintainability.

💡 Issue 6: Missing #include <vector> in CalEnergyCalibInfo.hh

The new file includes <vector> explicitly, which is good. However, it does not include "Offline/DataProducts/inc/CaloConst.hh" — wait, it does include it. And it doesn't include <map> like CalSourceEnergyCalib.hh does (though <map> isn't actually used there either). This is fine.

Summary

Severity Issue Location
🐛 Bug (pre-existing, fixed here) rowToCsv was missing frFull, frFirst, frSecond fields — this is a behavior change that should be documented CalSourceEnergyCalib.hh
⚠️ Medium status_message string in CSV can break parsing if it contains commas CalEnergyCalibInfo.hh
⚠️ Low std::string passed by value instead of by const-ref or moved CalEnergyCalibInfo.hh Row constructor
💡 Nit Inconsistent code style with existing tables CalEnergyCalibInfo.hh
The most important item to address is Issue 1 (the status_message CSV safety concern). The pre-existing rowToCsv bug fix is a welcome improvement but should be called out in the PR description since it changes behavior beyond just adding ndf.

Thanks for the detailed review.

Addressed in this PR:

  • Fixed the pre-existing CalSourceEnergyCalib::rowToCsv omission (frFull, frFirst, frSecond are now serialized).
  • Added ndf support in CalSourceEnergyCalib.
  • Fixed CSV robustness in CalEnergyCalibInfo for status_message (quotes/commas round-trip correctly).
  • Updated move semantics for status_message in the row constructor.
  • Added explicit CID linkage table (CalEnergyCalibInfoLink) as ad-hoc metadata.

Not changed intentionally in this PR:

  • std::stoul to uint16_t pattern (matches existing table code patterns; bounds checks remain in place).
  • Style-only formatting nits (kept scope on behavior/correctness and requested schema/provenance changes).

@zwl0331
Copy link
Author

zwl0331 commented Mar 3, 2026

@FNALbuild run build test

@FNALbuild
Copy link
Collaborator

⌛ The following tests have been triggered for bc3a3eb: build (Build queue - API unavailable)

@FNALbuild
Copy link
Collaborator

☀️ The build tests passed at bc3a3eb.

Test Result Details
test with Command did not list any other PRs to include
merge Merged bc3a3eb at 191ad5d
build (prof) Log file. Build time: 09 min 05 sec
ceSimReco Log file.
g4test_03MT Log file.
transportOnly Log file.
POT Log file.
g4study Log file.
cosmicSimReco Log file.
cosmicOffSpill Log file.
ceSteps Log file.
ceDigi Log file.
muDauSteps Log file.
ceMix Log file.
rootOverlaps Log file.
g4surfaceCheck Log file.
trigger Log file.
check_cmake Log file.
FIXME, TODO TODO (0) FIXME (0) in 5 files
clang-tidy 0 errors 0 warnings
whitespace check no whitespace errors found

N.B. These results were obtained from a build of this Pull Request at bc3a3eb after being merged into the base branch at 191ad5d.

For more information, please check the job page here.
Build artifacts are deleted after 5 days. If this is not desired, select Keep this build forever on the job page.

@zwl0331
Copy link
Author

zwl0331 commented Mar 3, 2026

Hi @rlcee @giro94 @oksuzian — I pushed an update addressing the open review points.

Main updates:

  • Explicitly documented the pre-existing CalSourceEnergyCalib::rowToCsv bug fix (frFull/frFirst/frSecond now serialized) and the ndf addition.
  • Improved CalEnergyCalibInfo CSV handling so status_message with commas/quotes round-trips correctly.
  • Added a dedicated ad-hoc CID link table (CalEnergyCalibInfoLink) to map CalEnergyCalib CID <-> CalEnergyCalibInfo CID, while keeping CalEnergyCalibInfo as archive-style content.

Could you please take another look when convenient? If this matches what you had in mind, I’d appreciate approval and merge.

@oksuzian
Copy link
Collaborator

oksuzian commented Mar 3, 2026

I'll review this pull request by first loading the PR details and then examining the code changes.Now let me look at the existing CalCosmicEnergyCalibInfo and CalSourceEnergyCalib base files for comparison, and the create.sql for the existing cal.energycalibinfo DDL:Now let me check the existing CalSourceEnergyCalib on main to compare:Now I have a thorough understanding of all the changed files. Here is my review:


PR #1718 Review — Issues Found

🔴 Issue 1: Missing SQL DDL for cal.sourceenergycalibndf column not added

The C++ code adds a new ndf column to CalSourceEnergyCalib (schema string, Row constructor, addRow, rowToCsv), but the SQL schema in create.sql does not add a corresponding ndf column to the cal.sourceenergycalib table. There is no existing DDL for sourceenergycalib in create.sql either — but if this table is managed by the val system and already deployed in production, an ALTER TABLE migration is needed. At minimum, if the database schema is supposed to match the C++ code, the ndf INTEGER column is missing.

🔴 Issue 2: Missing SQL DDL for cal.energycalibinfo table

The new CalEnergyCalibInfo class references "cal.energycalibinfo" as its db name, but no CREATE TABLE cal.energycalibinfo statement was added to create.sql. Only cal.energycalibinfolink was added. If this table is expected to be created via this DDL file, the definition is missing.

🟡 Issue 3: CalEnergyCalibInfoLink::addRow — no bounds check on columns size

  void addRow(const std::vector<std::string>& columns) override {
    _rows.emplace_back(std::stoi(columns[0]), std::stoi(columns[1]), columns[2]);
  }

If columns has fewer than 3 elements, this will produce an out-of-bounds access. Other tables in the codebase also lack this check, so it may be consistent with convention — but for a link table that stores free-form text (comment), it's more likely to encounter short rows. Consider adding a guard or at least defaulting comment to "" when columns.size() < 3.

🟡 Issue 4: CalEnergyCalibInfoLinkcomment field has the same unquoted-CSV problem that was fixed for CalEnergyCalibInfo

The comment field in CalEnergyCalibInfoLink is written and read as raw unquoted text:

  void rowToCsv(std::ostringstream& sstream, std::size_t irow) const override {
    Row const& r = _rows.at(irow);
    sstream << r.energyCalibCid() << ",";
    sstream << r.energyCalibInfoCid() << ",";
    sstream << r.comment();   // <-- not quoted
  }

If the comment text ever contains commas or double-quotes, CSV round-tripping will break — the exact problem that was fixed for CalEnergyCalibInfo::status_message in this same PR. The toCsvText/fromCsvText helpers should be reused here (or factored into a shared utility).

🟡 Issue 5: fromCsvText has an off-by-one boundary check that can drop a trailing escaped quote

  static std::string fromCsvText(const std::string& text) {
    // ...
    for (std::size_t i = 1; i + 1 < text.size(); ++i) {
      if (text[i] == '"' && i + 2 < text.size() && text[i + 1] == '"') {
        //                    ^^^^^^^^^^^^^^ should be i + 2 <= text.size() - 1

The condition i + 2 < text.size() means that when a doubled-quote "" appears right before the closing ", i.e. the sequence is ..."" followed by the final ", the check fails because i + 2 == text.size() - 1 is not < text.size(). This causes the inner " to be treated as a literal character rather than being unescaped. The same issue applies to the \\\" branch. The fix should be i + 1 < text.size() - 1 (or equivalently i + 2 < text.size() is actually fine for text[i+1] access, but the semantic intent is wrong — the loop already stops at i + 1 < text.size(), so position i+1 is the closing quote when i + 2 == text.size()).

Concrete example: Input "hello"" (7 chars, represents hello"). At i = 6, the loop stops because 6 + 1 < 7 is false. At i = 5, text[5] == '"' and 5 + 2 < 7 is false, so the doubled-quote at positions 5–6 is not recognized. Output: hello" — actually this specific case works because the raw char is " which is pushed. But for "ab""cd" (8 chars) at i = 3: text[3] == '"', 3 + 2 < 8 → true, works fine. Edge case """" (4 chars = escaped single ") at i = 1: text[1] == '"', 1 + 2 < 4 → true, works. So the boundary check is actually fine in practice for the "" escape. However, the \\\" branch has the same guard and is more suspect. Overall this is a minor concern.

🟡 Issue 6: CalCosmicEnergyCalibInfostd::move applied to by-value parameters that are already copies

    Row(int cID, int FirstCalibRun, int LastCalibRun, std::string EnergyMethod, std::string FitMethod, std::string Comment) :
        // ...
        _EnergyMethod(std::move(EnergyMethod)),
        _FitMethod(std::move(FitMethod)),
        _Comment(std::move(Comment)) {}

This is correct and beneficial — moving from by-value parameters avoids an extra copy. But the constructor signature still takes std::string by value (not by const reference), so callers that pass lvalues will copy into the parameter and then move. This is the idiomatic C++ pattern, so this change is fine. No issue here.

🔵 Issue 7 (Nit): CalEnergyCalibInfostatus_message() returns by value, not by const&

    std::string status_message() const { return _status_message; }

This copies the string on every call. Should be const std::string& for consistency and to avoid unnecessary allocations. The same pattern exists in CalEnergyCalibInfoLink::comment() and in the existing CalCosmicEnergyCalibInfo, so this may just be matching convention, but it's worth noting.

🔵 Issue 8 (Nit): CalEnergyCalibInfotoCsvText/fromCsvText are private helpers that could be shared

Since CalEnergyCalibInfoLink needs the same CSV quoting logic (see Issue 4), these should ideally be factored into a common base or utility, rather than being private static methods in one class.


Summary

# Severity File Issue
1 🔴 High create.sql Missing ndf column for cal.sourceenergycalib
2 🔴 High create.sql Missing CREATE TABLE cal.energycalibinfo DDL
3 🟡 Medium CalEnergyCalibInfoLink.hh No bounds check on columns in addRow
4 🟡 Medium CalEnergyCalibInfoLink.hh comment field not CSV-quoted (same bug fixed elsewhere in this PR)
5 🟡 Low CalEnergyCalibInfo.hh fromCsvText boundary check may miss edge-case escaped quotes
6 ✅ OK CalCosmicEnergyCalibInfo.hh std::move change is correct
7 🔵 Nit CalEnergyCalibInfo.hh / CalEnergyCalibInfoLink.hh String accessors return by value, not const&
8 🔵 Nit CalEnergyCalibInfo.hh CSV quoting helpers should be shared/reusable

Issues #1 and #2 are the most critical — the SQL DDL does not match the C++ table definitions, which will cause runtime failures when the tables are created or queried against a fresh database.

zwl0331 added 4 commits March 2, 2026 22:04
Define cal.energycalibinfo with schema matching CalEnergyCalibInfo:
(cid, roid, adc2mev, adc2mev_err, status_code, status_message)
and add primary key and grants.
Make CalEnergyCalibInfoLink comment field CSV-safe by quoting on output and
unquoting on input, and preserve compatibility by joining trailing parsed
columns when commas are present.
Add a guard in CalEnergyCalibInfoLink::addRow to require at least 3
columns before parsing, and throw a clear cet::exception otherwise.
Define cal.sourceenergycalib in DbService/data/create.sql with column
layout matching CalSourceEnergyCalib, including fr* fields and ndf,
plus primary key and grants.
@zwl0331
Copy link
Author

zwl0331 commented Mar 3, 2026

I'll review this pull request by first loading the PR details and then examining the code changes.Now let me look at the existing CalCosmicEnergyCalibInfo and CalSourceEnergyCalib base files for comparison, and the create.sql for the existing cal.energycalibinfo DDL:Now let me check the existing CalSourceEnergyCalib on main to compare:Now I have a thorough understanding of all the changed files. Here is my review:

PR #1718 Review — Issues Found

🔴 Issue 1: Missing SQL DDL for cal.sourceenergycalibndf column not added

The C++ code adds a new ndf column to CalSourceEnergyCalib (schema string, Row constructor, addRow, rowToCsv), but the SQL schema in create.sql does not add a corresponding ndf column to the cal.sourceenergycalib table. There is no existing DDL for sourceenergycalib in create.sql either — but if this table is managed by the val system and already deployed in production, an ALTER TABLE migration is needed. At minimum, if the database schema is supposed to match the C++ code, the ndf INTEGER column is missing.

🔴 Issue 2: Missing SQL DDL for cal.energycalibinfo table

The new CalEnergyCalibInfo class references "cal.energycalibinfo" as its db name, but no CREATE TABLE cal.energycalibinfo statement was added to create.sql. Only cal.energycalibinfolink was added. If this table is expected to be created via this DDL file, the definition is missing.

🟡 Issue 3: CalEnergyCalibInfoLink::addRow — no bounds check on columns size

  void addRow(const std::vector<std::string>& columns) override {
    _rows.emplace_back(std::stoi(columns[0]), std::stoi(columns[1]), columns[2]);
  }

If columns has fewer than 3 elements, this will produce an out-of-bounds access. Other tables in the codebase also lack this check, so it may be consistent with convention — but for a link table that stores free-form text (comment), it's more likely to encounter short rows. Consider adding a guard or at least defaulting comment to "" when columns.size() < 3.

🟡 Issue 4: CalEnergyCalibInfoLinkcomment field has the same unquoted-CSV problem that was fixed for CalEnergyCalibInfo

The comment field in CalEnergyCalibInfoLink is written and read as raw unquoted text:

  void rowToCsv(std::ostringstream& sstream, std::size_t irow) const override {
    Row const& r = _rows.at(irow);
    sstream << r.energyCalibCid() << ",";
    sstream << r.energyCalibInfoCid() << ",";
    sstream << r.comment();   // <-- not quoted
  }

If the comment text ever contains commas or double-quotes, CSV round-tripping will break — the exact problem that was fixed for CalEnergyCalibInfo::status_message in this same PR. The toCsvText/fromCsvText helpers should be reused here (or factored into a shared utility).

🟡 Issue 5: fromCsvText has an off-by-one boundary check that can drop a trailing escaped quote

  static std::string fromCsvText(const std::string& text) {
    // ...
    for (std::size_t i = 1; i + 1 < text.size(); ++i) {
      if (text[i] == '"' && i + 2 < text.size() && text[i + 1] == '"') {
        //                    ^^^^^^^^^^^^^^ should be i + 2 <= text.size() - 1

The condition i + 2 < text.size() means that when a doubled-quote "" appears right before the closing ", i.e. the sequence is ..."" followed by the final ", the check fails because i + 2 == text.size() - 1 is not < text.size(). This causes the inner " to be treated as a literal character rather than being unescaped. The same issue applies to the \\\" branch. The fix should be i + 1 < text.size() - 1 (or equivalently i + 2 < text.size() is actually fine for text[i+1] access, but the semantic intent is wrong — the loop already stops at i + 1 < text.size(), so position i+1 is the closing quote when i + 2 == text.size()).

Concrete example: Input "hello"" (7 chars, represents hello"). At i = 6, the loop stops because 6 + 1 < 7 is false. At i = 5, text[5] == '"' and 5 + 2 < 7 is false, so the doubled-quote at positions 5–6 is not recognized. Output: hello" — actually this specific case works because the raw char is " which is pushed. But for "ab""cd" (8 chars) at i = 3: text[3] == '"', 3 + 2 < 8 → true, works fine. Edge case """" (4 chars = escaped single ") at i = 1: text[1] == '"', 1 + 2 < 4 → true, works. So the boundary check is actually fine in practice for the "" escape. However, the \\\" branch has the same guard and is more suspect. Overall this is a minor concern.

🟡 Issue 6: CalCosmicEnergyCalibInfostd::move applied to by-value parameters that are already copies

    Row(int cID, int FirstCalibRun, int LastCalibRun, std::string EnergyMethod, std::string FitMethod, std::string Comment) :
        // ...
        _EnergyMethod(std::move(EnergyMethod)),
        _FitMethod(std::move(FitMethod)),
        _Comment(std::move(Comment)) {}

This is correct and beneficial — moving from by-value parameters avoids an extra copy. But the constructor signature still takes std::string by value (not by const reference), so callers that pass lvalues will copy into the parameter and then move. This is the idiomatic C++ pattern, so this change is fine. No issue here.

🔵 Issue 7 (Nit): CalEnergyCalibInfostatus_message() returns by value, not by const&

    std::string status_message() const { return _status_message; }

This copies the string on every call. Should be const std::string& for consistency and to avoid unnecessary allocations. The same pattern exists in CalEnergyCalibInfoLink::comment() and in the existing CalCosmicEnergyCalibInfo, so this may just be matching convention, but it's worth noting.

🔵 Issue 8 (Nit): CalEnergyCalibInfotoCsvText/fromCsvText are private helpers that could be shared

Since CalEnergyCalibInfoLink needs the same CSV quoting logic (see Issue 4), these should ideally be factored into a common base or utility, rather than being private static methods in one class.

Summary

Severity File Issue

1 🔴 High create.sql Missing ndf column for cal.sourceenergycalib
2 🔴 High create.sql Missing CREATE TABLE cal.energycalibinfo DDL
3 🟡 Medium CalEnergyCalibInfoLink.hh No bounds check on columns in addRow
4 🟡 Medium CalEnergyCalibInfoLink.hh comment field not CSV-quoted (same bug fixed elsewhere in this PR)
5 🟡 Low CalEnergyCalibInfo.hh fromCsvText boundary check may miss edge-case escaped quotes
6 ✅ OK CalCosmicEnergyCalibInfo.hh std::move change is correct
7 🔵 Nit CalEnergyCalibInfo.hh / CalEnergyCalibInfoLink.hh String accessors return by value, not const&
8 🔵 Nit CalEnergyCalibInfo.hh CSV quoting helpers should be shared/reusable
Issues #1 and #2 are the most critical — the SQL DDL does not match the C++ table definitions, which will cause runtime failures when the tables are created or queried against a fresh database.

Thanks for the detailed follow-up review — I addressed the concrete issues in this update.

Addressed:

  1. Missing cal.sourceenergycalib DDL / ndf:

    • Added CREATE TABLE cal.sourceenergycalib in DbService/data/create.sql with ndf, PK, and grants.
    • Commit: 85e842084.
  2. Missing cal.energycalibinfo DDL:

    • Added CREATE TABLE cal.energycalibinfo in DbService/data/create.sql with PK and grants.
    • Commit: bd8156f23.
  3. CalEnergyCalibInfoLink::addRow column bounds:

    • Added explicit columns.size() < 3 guard and throws CALENERGYCALIBINFOLINK_BAD_COLUMNS.
    • Commit: afc1f381f.
  4. CalEnergyCalibInfoLink CSV handling for comment:

    • Added CSV quoting/unquoting for comment and trailing-column join for robust round-trip.
    • Commit: 1d3f66ea4.

Not changed in this PR:
5. fromCsvText edge-case boundary concern:

  • Left unchanged for now (low-severity and behavior appears correct for current doubled-quote cases). I can adjust if you want this tightened explicitly.
  1. std::move in CalCosmicEnergyCalibInfo:

    • Agreed this is fine; no change needed.
  2. String getters returning by value:

    • Left as-is to stay consistent with existing table style in this code area.
  3. Sharing CSV helper utilities:

    • Kept helpers local in this PR to avoid broader refactor; can be done in follow-up if preferred.

Copy link
Collaborator

@giro94 giro94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything good to me!

I would wait for Ray's approval too.

Copy link
Collaborator

@rlcee rlcee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove create.sql from this PR, it not user maintained (copilot misfire)
I'd prefer to not name CalEnergyCalibInfo with the "Info" since we have a pattern that "Info" indicates a ad-hoc entry. What happened to CalCombinedEnergyCalib?
If you agree to the above, then CalEnergyCalibInfoLink can be CalEnergyCalibInfo and the pattern is restored.
It is surprising to plan for a string message for every channel, please confirm you are sure.
The status codes should be in an include where it can be accessed in a module or other places. I'd expect a new include in CaloConditions. Then move the description there.
In all the places you have std::move, please make the constructor argument "const std::string&" and remove the std::move.
Do not manage commas. The addRow will be parsed correctly and columns will have a length of 5, Please see https://mu2ewiki.fnal.gov/wiki/ConditionsData#strings
rowtoCsv is obsolete, but until that's enforced, just write a double quote at the start and end of message, and remove your text functions.

@zwl0331
Copy link
Author

zwl0331 commented Mar 4, 2026

Please remove create.sql from this PR, it not user maintained (copilot misfire) I'd prefer to not name CalEnergyCalibInfo with the "Info" since we have a pattern that "Info" indicates a ad-hoc entry. What happened to CalCombinedEnergyCalib? If you agree to the above, then CalEnergyCalibInfoLink can be CalEnergyCalibInfo and the pattern is restored. It is surprising to plan for a string message for every channel, please confirm you are sure. The status codes should be in an include where it can be accessed in a module or other places. I'd expect a new include in CaloConditions. Then move the description there. In all the places you have std::move, please make the constructor argument "const std::string&" and remove the std::move. Do not manage commas. The addRow will be parsed correctly and columns will have a length of 5, Please see https://mu2ewiki.fnal.gov/wiki/ConditionsData#strings rowtoCsv is obsolete, but until that's enforced, just write a double quote at the start and end of message, and remove your text functions.

Thanks for the detailed review. I pushed follow-up commits addressing the requested changes:

  1. Removed create.sql from this PR
  • be816de37
  1. Restored the naming pattern
  • Archive table is now CalCombinedEnergyCalib
  • The ad-hoc CID link table is now CalEnergyCalibInfo (energycalibcid,combinedenergycalibcid,comment)
  • 6a529fc2d
  1. Moved status codes to a shared include in CaloConditions
  • Added CaloConditions/inc/CalCombinedEnergyCalibStatus.hh and referenced it from the table code
  • 201fba9ce
  1. Replaced std::move pattern with const std::string& constructor args
  • 0888c5904
  1. Removed custom comma/text parsing helpers
  • addRow now reads columns directly
  • rowToCsv now writes quoted message/comment fields
  • 0ed747822

On the per-channel string field: yes, this is intentional for per-channel diagnostic/provenance context from the combination pass; status codes remain the canonical machine-readable field.

Could you please take another look when convenient?

@zwl0331 zwl0331 requested a review from rlcee March 4, 2026 06:55
@zwl0331 zwl0331 changed the title Update CalSourceEnergyCalib (add ndf + fix CSV serialization); add CalEnergyCalibInfo and CID link table Add CalCombinedEnergyCalib + CalEnergyCalibInfo CID link; update CalSourceEnergyCalib (ndf + CSV fix) Mar 4, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants