OHarb1
Occasional Contributor
5 years ago

Crash in quartus_pgm

We're doing regression testing on a build server and we need to program the FPGA using quartus_pgm.

However, we find that quartus_pgm is flaky.

Is there a chance that the Quartus developers could have a look at this to see where the problem might be?

There's plenty of information in the stack trace, so it should be possible to determine what's going on.


I'm inclined to believe that quartus_pgm is missing a check on something that is failing. An intelligible and actionable error message would be a huge improvement.

This is with Ubuntu 18.04 and Quartus 19.3.

*** Fatal Error: Segment Violation at 0x4000
Module: quartus_pgm
Stack Trace:
    0xf8769: FBGEN_FRAME::get_total_instr_size() + 0x29 (pgm_fbgen)
    0xf931c: FBGEN_FRAME::get_frame_size() + 0x2c (pgm_fbgen)
    0xf17a4: FBGEN_DBLOCK::get_block_size() + 0x24 (pgm_fbgen)
   0x42110b: PGMIO_FBGEN_PROXY::transfer_to_element() + 0x7b (pgm_pgmio)
   0x427df9: PGMIO_FBGEN_PROXY::create_bitsteam() + 0x329 (pgm_pgmio)
   0x2c2cc8: PGMIO_F2P::create_bitstream(PGM_CHAIN_ELEMENT*, std::vector<std::string, std::allocator<std::string> >*, PGMIO_CCF*) + 0x148 (pgm_pgmio)
   0x284c14: PGM_CHAIN_ELEMENT::generate_bv_list(bool) + 0x104 (pgm_pgmio)
   0x28868d: PGM_CHAIN_ELEMENT::create_chain_element(PGM_CHAIN_ELEMENT*, bool, FIO_PATH*, bool, PGMIO_CONFIG_SCHEME, bool, bool) + 0xd2d (pgm_pgmio)
    0x232d9: PGME_PROGRAMMER::lookup_device(PGM_CHAIN_ELEMENT*, PGMIO_CONFIG_SCHEME, bool, bool, bool) + 0x29 (pgm_pgme)
    0x2175d: QPGM_FRAMEWORK::create_element(std::string, std::string, unsigned int, unsigned int) + 0x601 (quartus_pgm)
    0x23b91: QPGM_FRAMEWORK::process_operation(std::string*) + 0x1e93 (quartus_pgm)
    0x24cde: QPGM_FRAMEWORK::post_check_arguments() + 0x2d6 (quartus_pgm)
    0x1c08f: qexe_standard_main(QEXE_FRAMEWORK*, QEXE_OPTION_DEFINITION const**, int, char const**) + 0x1bc (comp_qexe)
    0x1fd97: qpgm_main(int, char const**) + 0x5e (quartus_pgm)
    0x40720: msg_main_thread(void*) + 0x10 (ccl_msg)
     0x602c: thr_final_wrapper + 0xc (ccl_thr)
    0x407df: msg_thread_wrapper(void* (*)(void*), void*) + 0x62 (ccl_msg)
     0xa559: mem_thread_wrapper(void* (*)(void*), void*) + 0x99 (ccl_mem)
     0x8f92: err_thread_wrapper(void* (*)(void*), void*) + 0x27 (ccl_err)
     0x63f2: thr_thread_wrapper + 0x15 (ccl_thr)
    0x427e2: msg_exe_main(int, char const**, int (*)(int, char const**)) + 0xa3 (ccl_msg)
    0x1fe21: main + 0x26 (quartus_pgm)
    0x270b3: __libc_start_main + 0xf3 (c.so.6)
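
For anyone who wants to post-process a dump like the one above in a CI log, here is a minimal Python sketch that pulls out the fatal-error line and the (symbol, module) pair of each frame. The regular expressions are assumptions inferred from this single trace, not a documented Quartus format, and `parse_crash` is a hypothetical helper name.

```python
import re

# Assumed shape of the crash banner, e.g. "*** Fatal Error: Segment Violation at 0x4000".
FATAL_RE = re.compile(r"\*\*\* Fatal Error: (?P<reason>.+)")

# Assumed shape of a frame, e.g. "0xf8769: SYMBOL + 0x29 (module)".
FRAME_RE = re.compile(
    r"^\s*(?P<addr>0x[0-9a-fA-F]+):\s+(?P<symbol>.+?)\s+\+\s+"
    r"(?P<offset>0x[0-9a-fA-F]+)\s+\((?P<module>[^()\s]+)\)\s*$"
)


def parse_crash(log_text):
    """Return (reason, frames) for a crash dump, or None if no fatal error is found.

    frames is a list of (symbol, module) tuples, innermost frame first.
    """
    reason = None
    frames = []
    for line in log_text.splitlines():
        if reason is None:
            # Skip everything until the crash banner shows up.
            match = FATAL_RE.search(line)
            if match:
                reason = match.group("reason").strip()
            continue
        match = FRAME_RE.match(line)
        if match:
            frames.append((match.group("symbol"), match.group("module")))
    if reason is None:
        return None
    return reason, frames
```

Calling `parse_crash` on the captured output gives the innermost frame as `frames[0]`, which is usually the most useful line to surface in a build summary.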

7 Replies

  • JohnT_Altera
    Regular Contributor

    Hi,


    May I know how you can reproduce the issue? Could you provide the full message log?


    Are you able to reproduce the issue on another system?


    • OHarb1
      Occasional Contributor

      You already have the stack trace, so you should be able to inspect the source code to find the missing error check/message.

      Try running nios2-configure-sof with the cable detached and see if that reproduces the problem.

  • JohnT_Altera
    Regular Contributor

    Hi,


    If the cable is detached, then I don't expect it to work. I would like to confirm that the setup is correct and that the issue can be easily reproduced. Without a reproduction, we cannot be sure what is actually happening, even with the Internal Error information.


    • OHarb1
      Occasional Contributor

      The bug here isn't that it doesn't work, but that it crashes without a helpful error message.

      There should be a helpful error message without the cable attached, not a crash.

      This may seem like a small thing, but when using FPGAs in automated test setups (which *should* be par for the course in 2020), the logs are all you have to determine what the problem is.


      Can you reproduce a non-helpful error message without a cable attached?
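
      Until the tool reports this cleanly itself, a build job can wrap the programmer invocation and turn a crash into a one-line diagnosis in the log. A minimal Python sketch, assuming the crash banner starts with `*** Fatal Error:` as in the trace in the first post. `run_and_diagnose` is a hypothetical helper name, and the hint about the cable is only a guess based on this thread.

```python
import subprocess

# Assumption: crashes announce themselves with this banner, as in the trace above.
CRASH_MARKER = "*** Fatal Error:"


def run_and_diagnose(cmd, timeout_s=120):
    """Run a programmer command (list of arguments) and return (ok, diagnosis)."""
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
    except FileNotFoundError:
        return False, f"command not found: {cmd[0]}"
    except subprocess.TimeoutExpired:
        return False, f"timed out after {timeout_s} s"

    # Look for the crash banner first, since the exit code alone says little.
    for line in (proc.stdout + proc.stderr).splitlines():
        if CRASH_MARKER in line:
            return False, (
                f"tool crashed ({line.strip()}); "
                "possibly no cable or board attached (unconfirmed guess)"
            )
    if proc.returncode != 0:
        return False, f"exit code {proc.returncode}"
    return True, "ok"
```

      Pass it the same argument list the job already uses to start quartus_pgm, and fail the build step with the returned diagnosis when `ok` is false.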

  • JohnT_Altera
    Regular Contributor

    Hi,


    Usually, when the blaster is not connected, the tool shows an error message saying that no blaster is connected. It would be helpful if you could provide steps to reproduce the crash, because I am not able to observe the internal error you mention.


    • OHarb1
      Occasional Contributor

      I know that, ideally, you'd want a reproduction procedure.

      However, the stack trace should give the developers plenty of clues to hunt down the problem by code-inspection, which is why they made the stack trace in the first place.

  • JohnT_Altera
    Regular Contributor

    Hi,


    The internal error provides some guidance, but it does not fully identify the root cause. We need a way to reproduce the issue so that we can fix it correctly and avoid implementing the wrong solution.


    Sorry for the inconvenience.