Discussion:
Thread safety issue in krb5_verify_init_creds() when passing ccache pointer to null?
Thomas Sondergaard
2021-05-27 08:52:52 UTC
Permalink
Hi,

I have a multithreaded application that calls krb5_verify_init_creds() from
multiple threads. Each thread uses its own krb5_context. The code invoked
in each thread looks like something like this:

...
krb5_ccache cc = nullptr;
const auto err = krb5_verify_init_creds(context, &creds,
server_princ, kt, &cc, &opts);
krb5_principal princ;
krb5_cc_get_principal(context, &princ);
...
krb5_cc_close(context, cc);

The program occasionally crashes in the call to krb5_cc_get_principal. I
took a look at krb5_verify_init_creds() and believe I can see that when the
ccache argument is a pointer to null, it hits this code in get_vfy_cred(),
where ccache_arg is my cc above:

if (ccache_arg != NULL && ccache != NULL) {
if (*ccache_arg == NULL) {
ret = krb5_cc_resolve(context, "MEMORY:rd_req2", &retcc);
if (ret)
goto cleanup;
ret = krb5_cc_initialize(context, retcc, creds->client);
if (ret)
goto cleanup;
ret = copy_creds_except(context, ccache, retcc, creds->server);
if (ret)
goto cleanup;
*ccache_arg = retcc;
retcc = NULL;
} else {
ret = copy_creds_except(context, ccache, *ccache_arg, server);
}
}

If I understand this correctly I get a ticket cache "MEMORY:rd_req2", which
is local to the process. So when this function is called from multiple
threads with ccache being a pointer with the value NULL the threads will
stomp on the same ticket cache. Is this correct?

I had a look at the bug history too and found
https://krbdev.mit.edu/rt/Ticket/Display.html?id=3089. This ticket refers
to a thread safety issue related to another ticket cache "MEMORY:rd_req"
that krb5_verify_init_creds() used to use, but which has since been changed
to use krb5_cc_new_unique() in commit 17d690492927ad.

Shouldn't the ticket cache "MEMORY:rd_req2" in get_vfy_cred() be changed to
use krb5_cc_new_unique() too?

https://web.mit.edu/kerberos/krb5-1.18/doc/appdev/refs/api/krb5_verify_init_creds.html
doesn't say anything about who owns the ccache that
krb5_verify_init_creds() creates and returns. What is the proper protocol
for the caller when they are done with the ticket cache? Should they call
krb5_cc_close(context, cc); or krb5_cc_destroy(context, cc)? Will calling
just krb5_cc_close(context, cc) on a ccache created with
krb5_cc_new_unique() cause it to be leaked or will it automatically be
destroyed when all references to it have been closed?

Best regards,
Thomas
_______________________________________________
krbdev mailing list ***@mit.edu
https://mailman.mit.edu/mailman/listinfo/krbdev
Greg Hudson
2021-05-27 15:23:40 UTC
Permalink
Post by Thomas Sondergaard
krb5_ccache cc = nullptr;
const auto err = krb5_verify_init_creds(context, &creds,
server_princ, kt, &cc, &opts);
krb5_principal princ;
krb5_cc_get_principal(context, &princ);
...
krb5_cc_close(context, cc);
Out of curiosity, what does the code do with cc? This is the first time
I've heard of a caller using the credentials acquired by
krb5_verify_init_creds().
Post by Thomas Sondergaard
The program occasionally crashes in the call to krb5_cc_get_principal.
With what version of the krb5 libraries? The issue you identified below
is a concern, but a crash indicates a problem in the ccache layer.
Ticket #8202 (fixed in 1.17) addresses a bug that could explain the crash.
Post by Thomas Sondergaard
If I understand this correctly I get a ticket cache "MEMORY:rd_req2", which
is local to the process. So when this function is called from multiple
threads with ccache being a pointer with the value NULL the threads will
stomp on the same ticket cache. Is this correct?
It sounds like it. You could of course work around this issue by either
passing NULL if you don't need the credentials, or creating a ccache in
the caller.
Post by Thomas Sondergaard
Shouldn't the ticket cache "MEMORY:rd_req2" in get_vfy_cred() be changed to
use krb5_cc_new_unique() too?
[...]
Post by Thomas Sondergaard
Will calling
just krb5_cc_close(context, cc) on a ccache created with
krb5_cc_new_unique() cause it to be leaked or will it automatically be
destroyed when all references to it have been closed?
It will be leaked. (Well, the "leaked" objects would still be
theoretically accessible via the process-global memory ccache table, but
with names that nobody remembers.)

This concern may explain why the ticket #3089 fix didn't change this
code. Changing the code to call krb5_cc_new_unique() could, in theory,
create a memory ccache leak in a currently-working caller (one which
uses krb5_verify_init_creds() serially, of course, not in parallel like
your code).

If we choose to let the current code stand for that reason, we should of
course document the behavior and warn callers not to use this aspect of
the function concurrently.

Another option would be to create the notion of an anonymous memory
cache, which has no resolvable name and does get cleaned up when closed.
Heimdal recently (in Jan 2020) added this notion, making it happen when
"MEMORY:anonymous" is resolved--which is not strictly
backward-compatible, but presumably no one cares. (A minor
complication: unlike Heimdal, we have a krb5_cc_dup() function, the
implementation of which would currently break for an unresolvable ccache.)

_______________________________________________
krbdev mailing list ***@mit.edu
https://mailman.mit.edu/mailman/listinfo/krbdev
Thomas Sondergaard
2021-05-28 09:14:48 UTC
Permalink
Post by Greg Hudson
Post by Thomas Sondergaard
krb5_ccache cc = nullptr;
const auto err = krb5_verify_init_creds(context, &creds,
server_princ, kt, &cc, &opts);
krb5_principal princ;
krb5_cc_get_principal(context, &princ);
...
krb5_cc_close(context, cc);
Out of curiosity, what does the code do with cc? This is the first time
I've heard of a caller using the credentials acquired by
krb5_verify_init_creds().
It is a little curious. I didn't write the code, so I am not completely
sure why it is the way it is. The signature of function that contains this
code is:

ResultType verifyCredentials(std::string princ_name, std::string password,
std::string keytab_path, std::string server_principal)

It calls krb5_parse_name to create a principal from princ_name and then
krb5_get_init_creds_passwords to create a krb5_creds. It then calls
krb5_verify_init_creds(). If that checks out it then calls
krb5_cc_get_principal to dig the principal out of the returned krb5_ccache
and then calls krb5_unparse_name and returns that to the caller. Is there
any situation where this returned principal will be different from the one
the caller of the function provided?
Post by Greg Hudson
Post by Thomas Sondergaard
The program occasionally crashes in the call to krb5_cc_get_principal.
With what version of the krb5 libraries? The issue you identified below
is a concern, but a crash indicates a problem in the ccache layer.
Ticket #8202 (fixed in 1.17) addresses a bug that could explain the crash.
It is on CentOS 8.3.2011 with krb5-libs-1.18.2-5.el8.x86_64
Post by Greg Hudson
Post by Thomas Sondergaard
If I understand this correctly I get a ticket cache "MEMORY:rd_req2",
which
Post by Thomas Sondergaard
is local to the process. So when this function is called from multiple
threads with ccache being a pointer with the value NULL the threads will
stomp on the same ticket cache. Is this correct?
It sounds like it. You could of course work around this issue by either
passing NULL if you don't need the credentials, or creating a ccache in
the caller.
I'm going to to either create my own ccache - or just drop reading back the
default principal depending on the feedback I get to the question I asked
above.
Post by Greg Hudson
Post by Thomas Sondergaard
Shouldn't the ticket cache "MEMORY:rd_req2" in get_vfy_cred() be changed
to
Post by Thomas Sondergaard
use krb5_cc_new_unique() too?
[...]
Post by Thomas Sondergaard
Will calling
just krb5_cc_close(context, cc) on a ccache created with
krb5_cc_new_unique() cause it to be leaked or will it automatically be
destroyed when all references to it have been closed?
It will be leaked. (Well, the "leaked" objects would still be
theoretically accessible via the process-global memory ccache table, but
with names that nobody remembers.)
krb5_mcc_close(krb5_context context, krb5_ccache id) in
krb5/src/lib/krb5/ccache/cc_memory.c is the function that is called when
calling krb5_cc_close on a "MEMORY" type ccache, isn't it?
In krb5-1.82.2-final it looks like this:

krb5_error_code KRB5_CALLCONV
krb5_mcc_close(krb5_context context, krb5_ccache id)
{
krb5_mcc_data *d = id->data;
int count;
free(id);
k5_cc_mutex_lock(context, &d->lock);
count = --d->refcount;
k5_cc_mutex_unlock(context, &d->lock);
if (count == 0) {
/* This is the last active handle referencing d and d has been
removed
* from the table, so we can release it. */
empty_mcc_cache(context, d);
free(d->name);
k5_cc_mutex_destroy(&d->lock);
free(d);
}
return KRB5_OK;
}

When I call krb5_cc_close() on the only and therefore last handle to my
ccache created with krb5_cc_new_unique() I expect the refcount to reach 0
therefore calling empty_mcc_cache() to free the cache? So I was a little
unsure what is leaked. I also looked at krb5_mcc_destroy() in the same file
and it mentions removing the ccache from a table. Is it this empty table
entry that is leaked?

So the conclusion is that I should call krb5_cc_destroy() on the ccache
created with krb5_cc_new_unique() to get rid of it, right?

Thank you for your invaluable help!

Thomas
_______________________________________________
krbdev mailing list ***@mit.edu
https://mailman.mit.edu/mailman/listinfo/krbdev
Greg Hudson
2021-05-28 16:31:07 UTC
Permalink
Post by Thomas Sondergaard
It calls krb5_parse_name to create a principal from princ_name and then
krb5_get_init_creds_passwords to create a krb5_creds. It then calls
krb5_verify_init_creds(). If that checks out it then calls
krb5_cc_get_principal to dig the principal out of the returned krb5_ccache
and then calls krb5_unparse_name and returns that to the caller. Is there
any situation where this returned principal will be different from the one
the caller of the function provided?
There isn't. The returned ccache is initialized with creds->client
(vfy_increds.c:162), so reading its principal will just give that back
again.
Post by Thomas Sondergaard
Post by Greg Hudson
With what version of the krb5 libraries? The issue you identified below
is a concern, but a crash indicates a problem in the ccache layer.
Ticket #8202 (fixed in 1.17) addresses a bug that could explain the crash.
It is on CentOS 8.3.2011 with krb5-libs-1.18.2-5.el8.x86_64
In that case, if you have the ability to get a stack trace for the
crash, I'd be interested in seeing it.
Post by Thomas Sondergaard
krb5_mcc_close(krb5_context context, krb5_ccache id) in
krb5/src/lib/krb5/ccache/cc_memory.c is the function that is called when
calling krb5_cc_close on a "MEMORY" type ccache, isn't it?
Yes.
Post by Thomas Sondergaard
count = --d->refcount;
k5_cc_mutex_unlock(context, &d->lock);
if (count == 0) {
/* This is the last active handle referencing d and d has been
removed
* from the table, so we can release it. */
When I call krb5_cc_close() on the only and therefore last handle to my
ccache created with krb5_cc_new_unique() I expect the refcount to reach 0
therefore calling empty_mcc_cache() to free the cache?
No, the table gets its own reference. So when you create a new memory
ccache handle with krb5_cc_resolve(), it has a refcount of 2 (one for
the table, one for the handle). If you close it, it has a refcount of 1
(for the table). Only a krb5_cc_destroy() operation can allow the
refcount to hit 0 and release the cache object.
Post by Thomas Sondergaard
So I was a little
unsure what is leaked. I also looked at krb5_mcc_destroy() in the same file
and it mentions removing the ccache from a table. Is it this empty table
entry that is leaked?
It's not empty; the populated cache persists until all references to it
are removed, including the table's.
Post by Thomas Sondergaard
So the conclusion is that I should call krb5_cc_destroy() on the ccache
created with krb5_cc_new_unique() to get rid of it, right?
Yes. Otherwise the cache still lives in the table and, if anyone
remembers its name, resolvable in order to get at its contents.
_______________________________________________
krbdev mailing list ***@mit.edu
https://mailman.mit.edu/mailman/listinfo/krbdev
Thomas Sondergaard
2021-05-31 15:34:57 UTC
Permalink
Post by Greg Hudson
Post by Thomas Sondergaard
It calls krb5_parse_name to create a principal from princ_name and then
krb5_get_init_creds_passwords to create a krb5_creds. It then calls
krb5_verify_init_creds(). If that checks out it then calls
krb5_cc_get_principal to dig the principal out of the returned krb5_ccache
and then calls krb5_unparse_name and returns that to the caller. Is there
any situation where this returned principal will be different from the one
the caller of the function provided?
There isn't. The returned ccache is initialized with creds->client
(vfy_increds.c:162), so reading its principal will just give that back
again.
Excellent. That makes the bug irrelevant for me, as I am now just calling
krb5_get_init_creds_password() with a null pointer for ccache.
Post by Greg Hudson
Post by Thomas Sondergaard
Post by Greg Hudson
With what version of the krb5 libraries? The issue you identified below
is a concern, but a crash indicates a problem in the ccache layer.
Ticket #8202 (fixed in 1.17) addresses a bug that could explain the crash.
It is on CentOS 8.3.2011 with krb5-libs-1.18.2-5.el8.x86_64
In that case, if you have the ability to get a stack trace for the
crash, I'd be interested in seeing it.
For the next 80-90 days you can use this link:
https://sentry.io/share/issue/abfd015151f2493898af980fc94bee79/ where you
can see the stacktrace of the crashing thread and all the other threads.
The crashing thread itself has this stack. I don't think it is that
helpful. My function getDefaultPrincipal() calls krb5_cc_get_principal()
followed by krb5_unparse_name() to get the string representation of the
default principal.


PosixSignal: Signal 11 in node
File "<unknown>", in __memmove_sse2_unaligned_erms
File "/usr/include/bits/string_fortified.h", line 34, in memcpy
File
"/usr/src/debug/krb5-1.18.2-5.el8.x86_64/src/lib/krb5/krb/copy_data.c",
line 93, in krb5int_copy_data_contents_add0
File
"/usr/src/debug/krb5-1.18.2-5.el8.x86_64/src/lib/krb5/krb/copy_princ.c",
line 63, in krb5_copy_principal
File "<unknown>", in getDefaultPrincipal[abi:cxx11]
File "<unknown>", in verifyCredentials
File "<unknown>", in std::_Function_handler<T>::_M_invoke
File "<unknown>", in SimplePromiseWorker::Execute
File "<unknown>", in Napi::AsyncWorker::OnExecute
File "/home/iojs/build/ws/out/../deps/uv/src/threadpool.c", line 122, in
worker
File "<unknown>", in start_thread
File "<unknown>", in clone

Best regards,
Thomas
_______________________________________________
krbdev mailing list ***@mit.edu
https://mailman.mit.edu/mailman/listinfo/krbdev

Loading...