Ma, Yanan
2010-06-21 23:00:40 UTC
Resend to new address at sourceforge.net
From: Ma, Yanan
Sent: Tuesday, June 15, 2010 2:49 PM
To: '***@objectsecurity.com'; 'mico-***@mico.org'
Cc: Ma, Yanan
Subject: mico bugs report
Hi,
The reported bugs were found in mico-2.3.12 but same problems in 2.3.13. Mico crashes when a cancellation is received against a previous request, the previous request could be done or still being processed by another thread.
The problem code is in orb/iop.cc (based on 2.3.12). The fix is highlighted in RED below.
1. At line # 4702: missing brackets '{' and '}', which resulted in always returning the first in queue, most likely not the one that matches.
2. At line # 4728: rec could be NULL and del_invoke_orbid(rec) should not be called here, it should be called by the calling function instead.
3. At line # 5228: del_invoke_orbid (rec) should be commented out to avoid crashes, which is related to the next one.
4. At line # 4723: get_request_hint( id ) could return a non-NULL pointer, but it's not a valid IIOPServerInvokeRec*. This can be triggered when a request is being processed by one thread and it takes a long time to finish, then a cancellation comes in and is process by another thread to remove the entry out of _orbids, then the request itself was finally completed and MICO::IIOPServer::handle_invoke_reply(CORBA::ORBMsgId id) is called, which will call pull_invoke_orbid ( id ) at line # 5248. Things are getting really messy here and it seems there is a design flaw. I don't have a good solution other than commenting line # 5228 out to avoid craches, this temporary fix works fine but I am hoping for a complete fix. There is an interesting comment in include/mico/orb_mico.h:
461 //FIXME: hopefully nobody is going to shot me for this :-)
462 void set_request_hint(ORBMsgId id, void *hint) {
463 ORBInvokeRec *rec = get_invoke (id);
464 assert( rec );
465 rec->set_request_hint( hint );
466 };
467 void *get_request_hint(ORBMsgId id) {
468 ORBInvokeRec *rec = get_invoke (id);
469 assert( rec );
470 return rec->get_request_hint();
471 };
4685 MICO::IIOPServerInvokeRec *
4686 MICO::IIOPServer::pull_invoke_reqid (MsgId msgid, GIOPConn *conn)
4687 {
4688 MICOMT::AutoLock l(_orbids_mutex);
4689
4690 #ifdef USE_IOP_CACHE
4691 if (_cache_used && _cache_rec->reqid() == msgid &&
4692 _cache_rec->conn() == conn) {
4693 _cache_rec->deactivate();
4694 return _cache_rec;
4695 }
4696 #endif
4697 // XXX slow, but only needed for cancel
4698
4699 IIOPServerInvokeRec *rec;
4700 for (MapIdConn::iterator i = _orbids.begin(); i != _orbids.end(); ++i) {
4701 rec = (*i).second;
4702 if (rec->reqid() == msgid && rec->conn() == conn && rec->active() ) { // mico crash bug: '{' missing and always return first in queue
4703 rec->deactivate();
4704 return rec; }
4705 }
4706 return 0;
4707 }
4708
4709 MICO::IIOPServerInvokeRec *
4710 MICO::IIOPServer::pull_invoke_orbid (CORBA::ORBMsgId id)
4711 {
4712 #ifdef USE_IOP_CACHE
4713 if (_cache_used && _cache_rec->orbid() == msgid) {
4714 _cache_rec->deactivate();
4715 return _cache_rec;
4716 }
4717 #endif
4718
4719 MICOMT::AutoLock l(_orbids_mutex);
4720
4721 MICO::IIOPServerInvokeRec *rec;
4722
4723 rec = (MICO::IIOPServerInvokeRec *)_orb->get_request_hint( id );
4724 if (rec && rec->active()){
4725 rec->deactivate();
4726 return rec;
4727 }
4728 //del_invoke_orbid(rec); // mico crash bug: rec could be NULL, del_invoke_orb should be called by the calling function when rec is not NULL.
4729 return 0;
4730 }
5186 MICO::IIOPServer::handle_cancel_request (GIOPConn *conn, GIOPInContext &in)
5187 {
5188 CORBA::ULong req_id;
5189
5190 if (!conn->codec()->get_cancel_request (in, req_id)) {
5191 if (MICO::Logger::IsLogged (MICO::Logger::GIOP)) {
5192 MICOMT::AutoDebugLock __lock;
5193 MICO::Logger::Stream (MICO::Logger::GIOP)
5194 << "GIOP: cannot decode CancelRequest from "
5195 << conn->transport()->peer()->stringify() << endl;
5196 }
5197 #ifdef HAVE_THREADS
5198 conn->active_deref();
5199 #endif // HAVE_THREADS
5200 conn_error (conn);
5201 return FALSE;
5202 }
5203
5204 if (MICO::Logger::IsLogged (MICO::Logger::GIOP)) {
5205 MICOMT::AutoDebugLock __lock;
5206 MICO::Logger::Stream (MICO::Logger::GIOP)
5207 << "GIOP: incoming CancelRequest from "
5208 << conn->transport()->peer()->stringify()
5209 << " for msgid " << req_id << endl;
5210 }
5211
5212 conn->cancel (req_id);
5213
5214 IIOPServerInvokeRec *rec = pull_invoke_reqid (req_id, conn);
5215 #ifdef HAVE_THREADS
5216 conn->active_deref();
5217 #endif // HAVE_THREADS
5218 if (!rec) {
5219 // request already finished or no such id
5220 return TRUE;
5221 }
5222 CORBA::ORB::MsgId orbid = rec->orbmsgid();
5223
5224 // maybe rec->conn() != conn ...
5225 // connection might be deleted during call to orb->cancel() ...
5226
5227 //FIXME: should we realy delete it here ????????????
5228 //del_invoke_orbid (rec); // This is really a problem that I don't have a good solution other than commenting it out to avoid crashes.
5229
5230 _orb->cancel (orbid);
5231
5232 // maybe the connection was closed inbetween: make do_read() break
5233 //return FALSE;
5234 // kcg: we need to return TRUE here, because of thread-per-connection
5235 // concurrency model, where we cannot break do_read when we are not
5236 // 100% sure that the connection is either closed or broken
5237 return TRUE;
5238 }
Yanan Ma
Quantitative Analyst
Stark Investments | Quantitative Research
3600 South Lake Drive | St. Francis, WI 53235
414.294.7756 | Fax: 414.294.7956
________________________________
This transmission contains information for the exclusive use of the intended recipient and may be privileged, confidential and/or otherwise protected from disclosure. Any unauthorized review or distribution is strictly prohibited. Our company is required to retain electronic mail messages, which may be produced at the request of regulators or in connection with litigation. Electronic messages cannot be guaranteed to be secure, timely or error-free. As such, we recommend that you do not send confidential information to us via electronic mail. This communication is for informational purposes only and is not an offer or solicitation to buy or sell any investment product. Any information regarding specific investment products is subject to change without notice. If you received this transmission in error, please notify the sender immediately by return e-mail and delete this message and any attachments from your system.
From: Ma, Yanan
Sent: Tuesday, June 15, 2010 2:49 PM
To: '***@objectsecurity.com'; 'mico-***@mico.org'
Cc: Ma, Yanan
Subject: mico bugs report
Hi,
The reported bugs were found in mico-2.3.12 but same problems in 2.3.13. Mico crashes when a cancellation is received against a previous request, the previous request could be done or still being processed by another thread.
The problem code is in orb/iop.cc (based on 2.3.12). The fix is highlighted in RED below.
1. At line # 4702: missing brackets '{' and '}', which resulted in always returning the first in queue, most likely not the one that matches.
2. At line # 4728: rec could be NULL and del_invoke_orbid(rec) should not be called here, it should be called by the calling function instead.
3. At line # 5228: del_invoke_orbid (rec) should be commented out to avoid crashes, which is related to the next one.
4. At line # 4723: get_request_hint( id ) could return a non-NULL pointer, but it's not a valid IIOPServerInvokeRec*. This can be triggered when a request is being processed by one thread and it takes a long time to finish, then a cancellation comes in and is process by another thread to remove the entry out of _orbids, then the request itself was finally completed and MICO::IIOPServer::handle_invoke_reply(CORBA::ORBMsgId id) is called, which will call pull_invoke_orbid ( id ) at line # 5248. Things are getting really messy here and it seems there is a design flaw. I don't have a good solution other than commenting line # 5228 out to avoid craches, this temporary fix works fine but I am hoping for a complete fix. There is an interesting comment in include/mico/orb_mico.h:
461 //FIXME: hopefully nobody is going to shot me for this :-)
462 void set_request_hint(ORBMsgId id, void *hint) {
463 ORBInvokeRec *rec = get_invoke (id);
464 assert( rec );
465 rec->set_request_hint( hint );
466 };
467 void *get_request_hint(ORBMsgId id) {
468 ORBInvokeRec *rec = get_invoke (id);
469 assert( rec );
470 return rec->get_request_hint();
471 };
4685 MICO::IIOPServerInvokeRec *
4686 MICO::IIOPServer::pull_invoke_reqid (MsgId msgid, GIOPConn *conn)
4687 {
4688 MICOMT::AutoLock l(_orbids_mutex);
4689
4690 #ifdef USE_IOP_CACHE
4691 if (_cache_used && _cache_rec->reqid() == msgid &&
4692 _cache_rec->conn() == conn) {
4693 _cache_rec->deactivate();
4694 return _cache_rec;
4695 }
4696 #endif
4697 // XXX slow, but only needed for cancel
4698
4699 IIOPServerInvokeRec *rec;
4700 for (MapIdConn::iterator i = _orbids.begin(); i != _orbids.end(); ++i) {
4701 rec = (*i).second;
4702 if (rec->reqid() == msgid && rec->conn() == conn && rec->active() ) { // mico crash bug: '{' missing and always return first in queue
4703 rec->deactivate();
4704 return rec; }
4705 }
4706 return 0;
4707 }
4708
4709 MICO::IIOPServerInvokeRec *
4710 MICO::IIOPServer::pull_invoke_orbid (CORBA::ORBMsgId id)
4711 {
4712 #ifdef USE_IOP_CACHE
4713 if (_cache_used && _cache_rec->orbid() == msgid) {
4714 _cache_rec->deactivate();
4715 return _cache_rec;
4716 }
4717 #endif
4718
4719 MICOMT::AutoLock l(_orbids_mutex);
4720
4721 MICO::IIOPServerInvokeRec *rec;
4722
4723 rec = (MICO::IIOPServerInvokeRec *)_orb->get_request_hint( id );
4724 if (rec && rec->active()){
4725 rec->deactivate();
4726 return rec;
4727 }
4728 //del_invoke_orbid(rec); // mico crash bug: rec could be NULL, del_invoke_orb should be called by the calling function when rec is not NULL.
4729 return 0;
4730 }
5186 MICO::IIOPServer::handle_cancel_request (GIOPConn *conn, GIOPInContext &in)
5187 {
5188 CORBA::ULong req_id;
5189
5190 if (!conn->codec()->get_cancel_request (in, req_id)) {
5191 if (MICO::Logger::IsLogged (MICO::Logger::GIOP)) {
5192 MICOMT::AutoDebugLock __lock;
5193 MICO::Logger::Stream (MICO::Logger::GIOP)
5194 << "GIOP: cannot decode CancelRequest from "
5195 << conn->transport()->peer()->stringify() << endl;
5196 }
5197 #ifdef HAVE_THREADS
5198 conn->active_deref();
5199 #endif // HAVE_THREADS
5200 conn_error (conn);
5201 return FALSE;
5202 }
5203
5204 if (MICO::Logger::IsLogged (MICO::Logger::GIOP)) {
5205 MICOMT::AutoDebugLock __lock;
5206 MICO::Logger::Stream (MICO::Logger::GIOP)
5207 << "GIOP: incoming CancelRequest from "
5208 << conn->transport()->peer()->stringify()
5209 << " for msgid " << req_id << endl;
5210 }
5211
5212 conn->cancel (req_id);
5213
5214 IIOPServerInvokeRec *rec = pull_invoke_reqid (req_id, conn);
5215 #ifdef HAVE_THREADS
5216 conn->active_deref();
5217 #endif // HAVE_THREADS
5218 if (!rec) {
5219 // request already finished or no such id
5220 return TRUE;
5221 }
5222 CORBA::ORB::MsgId orbid = rec->orbmsgid();
5223
5224 // maybe rec->conn() != conn ...
5225 // connection might be deleted during call to orb->cancel() ...
5226
5227 //FIXME: should we realy delete it here ????????????
5228 //del_invoke_orbid (rec); // This is really a problem that I don't have a good solution other than commenting it out to avoid crashes.
5229
5230 _orb->cancel (orbid);
5231
5232 // maybe the connection was closed inbetween: make do_read() break
5233 //return FALSE;
5234 // kcg: we need to return TRUE here, because of thread-per-connection
5235 // concurrency model, where we cannot break do_read when we are not
5236 // 100% sure that the connection is either closed or broken
5237 return TRUE;
5238 }
Yanan Ma
Quantitative Analyst
Stark Investments | Quantitative Research
3600 South Lake Drive | St. Francis, WI 53235
414.294.7756 | Fax: 414.294.7956
________________________________
This transmission contains information for the exclusive use of the intended recipient and may be privileged, confidential and/or otherwise protected from disclosure. Any unauthorized review or distribution is strictly prohibited. Our company is required to retain electronic mail messages, which may be produced at the request of regulators or in connection with litigation. Electronic messages cannot be guaranteed to be secure, timely or error-free. As such, we recommend that you do not send confidential information to us via electronic mail. This communication is for informational purposes only and is not an offer or solicitation to buy or sell any investment product. Any information regarding specific investment products is subject to change without notice. If you received this transmission in error, please notify the sender immediately by return e-mail and delete this message and any attachments from your system.